DEPARTMENT OF CIVIL & ENVIRONMENTAL ENGINEERING 
COLLEGE OF ENGINEERING & TECHNOLOGY 
OLD DOMINION UNIVERSITY 
NORFOLK, VIRGINIA 23529-0162 

NEW PARALLEL ALGORITHMS FOR STRUCTURAL 
ANALYSIS AND DESIGN OF AEROSPACE STRUCTURES 


By 

Dr. Due T. Nguyen, Principal Investigator 
Department of Civil & Environmental Engineering 

FINAL REPORT 

For the period ending June 30, 1998 

Prepared for 

NASA Langley Research Center 

Attn.: R. Todd Lacks 

Grant Officer 

Mail Stop 126 

Hampton, V A 23681-0001 

And 

NASA Langley Research Center 

Attn.: Olof O. Storaasli 

Technical Officer 

Mail Stop 240 

Hampton, V A 23681-0001 

Under 

NASA Grant NAG 1-858 
ODURF Project No. 184047 

Submitted by 

Old Dominion University Research Foundation 
800 West 46* Street 
Norfolk, VA 23508 


October 1998 



ABSTRACT 


Subspace and Lanczos iterations have been developed, well documented, and widely 
accepted as efficient methods for obtaining p-lowest eigen-pair solutions of large-scale, practical 
engineering problems. Tne focus of this paper is to incorporate recent developments in vectorized 
sparse technologies in conjunctions with Subspace and Lanczos iterative algorithms for 
computational enhancements. Numerical performance, in terms of accuracy and efficiency of the 
proposed sparse strategies for Subspace and Lanczos algorithms, is demonstrated by solving for 
the lowest frequencies and mode shapes of structural problems on the IBM-R6000/590 and Sun- 
Sparc 20 workstations. 


1. INTRODUCTION 

The finite element method has been used successfully for the solution of many practical 
engineering problems in various disciplines, such as structural analysis, fluid mechanics, 
structural optimization, heat transfer etc. [1-5]. Essential to the finite element solution of these 
problems is an effective numerical procedure for solving large-scale, sparse systems of linear 
equations and generalized eigen-equations. These solution phases typically represent the most 
costly step of the analysis in terms of computational resources. 

Subspace and Lanczos iterations have been developed, well documented, and widely 
accepted as efficient methods for obtaining p-lowest eigen-pair solutions of large-scale, practical 
engineering problems [6-14], The focus of this paper is, however, to reexamine these 2 popular 
eigen-solution algorithms, with the viewpoints to incorporate recent developments in vectorized 
sparse technologies in conjunctions with Subspace and Lanczos iterative algorithms for 
computational enhancements. Basic subspace iteration algorithm is reviewed in Section 2. Key 
steps in Lanczos eigen-solution algorithm is summarized in Section 3. Major computational tasks 
in Subspace and Lanczos iterative algorithms are identified in Section 4. Computational 
enhancements using vectorized, sparse strategies are discussed in Section 5. Numerical 
evaluations of the proposed sparse algori thms , and the developed software are demonstrated in 
Section 6, through practical finite element models. Finally, conclusions are drawn in Section 7. 
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2. BASIC SUBSPACE ITERATION ALGORITHM [I, 6-9] 

The generalized eigen-equations, in matrix notation, can be expressed as 

[*] [4>] = [M] [<j>] [A] (1) 

In Eq. (1), matrices [K] and [M] represent the strucmral stiffness and mass, respectively. 
Matrices [A] and [4>] represent the eigenvalues and eigenvectors, respectively. The dimension 
(or degree-of-ffeedom) of matrices in Eq. (1) is N. For many practical engineering applications, 
[K] is symmetrical and positive definite. Subspace iteration algorithm can be used effectively to 
obtain the lowest p eigen-pair solutions. The algorithm can be conveniently described by the 
following step-by-step procedures: 


Table 1: Step-by step Basic Subspace Algorithm 

Step 1 : Select the starting iteration vectors [YJ Nut where q < < N 

Step 2 : Factorize the strucmral stiffness matrix 

m = [L][D][L] t (2) 

In Eq. (2), [ L ] is the lower triangular matrix, and [ D ] is the diagonal matrix 

Step-3 : For k = 1,2 Maxiter, where Maxiter represents the input maximum number of iterations, 

the following tasks need to be done 


Step 4 : Solve [0^, ] NxlJ from the following matrix equations 



(3) 

Step 5: Compute the reduced stiffness matrix 



(4) 

Step 6: Compute the reduced mass matrix 


[f,.,],., - [MW®, 

(5) 


(6) 

Step 7: Solve the reduced eisen-equations 


[*ViW2, 

(7) 


The eigenvalues (Q 2 ^,] and the associated eigenvectors [ Q k+l ] need to be arranged in the ascending orders 
(for example Qr x < 0% < < ... ) 

Step 8 : Find an improved approximation to the eigenvectors 

( 8 ) 

Step 9 : Check for convergence. The iterative process will be stopped if either convergence is achieved, 
or the maximum number of iteration ( = Maxiter) is reached (or else, return back to step 3). 
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3. LANCZOS ALGORITHM [1,3,10] 


' S 


Recently, the Lanczos algorithm for the solution of generalized eigenvalue problems has 
been receiving a lot of attention due to its computational efficiency . The original, generalized 
eigenvalue equation can be written as: 

K (J) = co 2 M<]> (9) 


or 

KJ = co><{) (10) 

where K and M are structural stiffness matrix and mass matrix, respectively, K^= K - oM, a is 
the shift value and co 0 2 =ur - a 


Instead of solving Eq. (9), or Eq. (10) directly, the Lanczos algorithm generates a tri-diagonal 
matrix T„, 

«, P: 

P: P 3 

P, -3 


r = 

i 

p. 

p. a . 

through the following three-term recurrence: 

r = p q = K~ l Mq -aq -Q q . 

or in the matrix form: 

[K~ l M]0 -OT = {0.0....r } = r e T 

u 0 J ~ ~ mm 1 mm 


(11) 


( 12 ) 

(13) 


^ = 9* (14) 

where e T m = (0,0, ...1), Q m is a Nxm orthogonal matrix with columns qj = 1,2,3 ...m, and m is 
usually much smaller than N. By solving the following reduced eigensystem the eigensolution 
of Eq. (10) can be obtained as 


<t> = e„-- 


(16) 


For most structural engineering problems, only a few lowest frequencies and the 
corresponding mode shapes are required, so we have m < < N which leads to a significant 
savings in the number of operations. 
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A partial restoring orthogonality scheme and a convergence criterion are developed and 
incorporated into the basic Lanczos algorithm, which is described in a step-by-step procedure, 
shown in Table 2. 


Table 2: Step-by-Step Basic Lanczos Algorithm 


Step 1. 

Factorization :K, = LDL 

T 


Form starting vector: y 0 * C 

q» = o 

Step 2. 

Compute: M yo 


Step 3. 

Compute : 



Pi = ho TM io ; ?, 

^7|cCL 

II 

Step 4. 

Compute : Pj = Mq, 


Lanczos iteration 



Forj = 1,2,3, .... do 


Step 5, 

e,= K.' : P, 


Step 6. 

6 . = V Pj 


Step 7. 

“i = V M 6 j = p , T5 i 


Step 8. 

Yj = V a j4j 


Step 9. 

Aj = M Yj 


Step 10. 

1 



Py-i = (Y/Wj ) 2 = V 

J * J 


Reorthogonalization of q^ 1 


Step 11. 

y. 

A. 


q . - 1 ; P = 

) 


71 p,-, 

P;*. 

Step 12. 

IF necessary solve Eq( 14) 

: T.z = 0z 


Converged? ( If "No", then return to 

Step 13. 

Eigenvector transformation: 4> = QjZ 


Reorthogonalization of Lanczos Vectors 


Various reorthogonalization schemes have been developed to increase the efficiency of 
Lanczos algorithms [10-14], However, for very large problems where factorization, 
forward/backward substitution and matrix-vector multiplication are the major operations, the cost 
of reorthogonalization becomes less important than for small problems, since only a few lowest 
eigenpairs are desired. In this work, a simple way of reorthogonalization is adopted. 


First for any new Lanczos Vector qj, calculate 

E i =q‘Mq j (i =1,2 j- 1) (17) 

If Ej > E, then q, should be orthogonal to qj with respect to, M where E is a parameter related 
to the machine parameter E 0 such that 1+E 0 > 1. Usually, E is taken as: 

E = JE. (18) 

Eq. (18) is called semi-orthogonality [12] condition. 


4 



Convergence Criterion and Error Norm Check 

One major advantage of the Lanczos algorithm lies in their ability to terminate the iteration 
process as soon as the required eigenpairs have converged. In this work, the following error 
bound for eigenvalues is used (after solving Eq. 14 in step 12) 

X - 0 Z (,) 

ERROR (/) = | — - — i| = |P — | where i = l,2,....j (19) 

U 0 

i / 

In Eq. (19), X K is the k-th exact eigenvalue and is the i-th computed eigenvalue. Z, 0 ' is the 
j-th element of vector Z co . If ERROR(i) < RTOL, for i = 1,2 ..p (where RTOL is a user's 
specified tolerance, and p is the number of eigenpairs to be extracted) then the Lanczos iteration 
is considered to be converged and the program begins to perform the eigenvector transformation 
accordingly (see step 13 of Table 1). 

4. MAJOR COMPUTATIONAL TASKS IN SUBSPACE ITERATIONS .AND 

LANCZOS ALGORITHM 

Careful observations on the subspace iteration, and Lanczos algorithms indicate that the 
following major computational tasks are required: 

Major task 1 : Matrix factorization (see' step 2 of subspace iteration, and step 1 of Lanczos 
algorithm). 

Major task 2 : Forward and backward equation solutions (see step 4 of subspace iteration, and 
step 5 of Lanczos algorithm). 

Major task 3 : Matrix-Vector (or Matrix-Matrix) multiplications (see steps 5,6 & 8 of Subspace 
iteration, and steps 2,4,7,9,10 & 13 of Lanczos algorithm). 

Computational enhancements in conjunction with the above major tasks will be discussed with 
great details in the next section. 


5. COMPUTATIONAL ENHANCEMENTS FOR SUBSPACE AND LANCZOS 

ALGORITHMS 

It has been pointed out in Section 4 that matrix factorization, forward & backward 
equation solution, and matrix- vector (or matrix-matrix) multiplications represent the major 
computational lasts for Subspace iteration, and Lanczos algorithms. Recent developments in 
Sparse technologies [15] will be fully utilized to improve the computational efficiency of both 
subspace iteration, and Lanczos algorithms. 

LDL T Algorithm 
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*s 


The Choleski (or LFU) factorization is efficient, however its application is limited to the 
case where the coefficient stiffness matrix [K] is symmetry and positive definite. With negligible 
additional computational efforts, the LDL T algorithm can be used for broader applications 
(where the coefficient matrix can be either positive, or negative definite). In this algorithm, the 
given matrix [K] in Eq. (1) can be factorized as 

[£] = [ 1 1 [£] [L] T ( 20 ) 


Where [L] and [D] are lower triangular matrix (with unit values on the diagonal), and diagonal 
matrix, respectively. For a simple 3X3 symmetrical stiffness matrix, Eq. (20) can be explicitly 
expressed as 



*.2 

*13 


1 

0 

0 


’ 0 . 

0 

0 


’ 1 *2. 
0 1 


*2, 

K 22 

*23 

= 

*21 

1 

0 


0 

D 2 

0 


£32 

1 

*31 

K n 

*33] 


.*3, 

^32 

1 


0 

D. 

1 



0 0 


The unknown and D ; can be easily obtained by expressing the equalities between the upper 
triangular matrix (on the left-hand-side) and its corresponding terms on the right-hand-side of Eq. 
(21). Since the LDL T algorithm will be used later on to develop efficient, vectorized sparse 
algorithm, a pseudo-FORTRAN skeleton code is given in Table 3 (assuming the original given 
matrix [K] is symmetrical and full). 


1. C Assuming row 1 has been factorized earlier 

2. Do 11 I = 2, N 


3. Do22K= 1,1-1 

4. C Compute the multiplier (Note : U represents L T ) 

5. XMULT = U(K,I) / U(K,K) 

6. Do 33 J = I, N 

7. U(I,J) = U(I,J) - XMULT * U(K.J) 

8. 33 CONTINUE 

9. U(K,I) = XMULT 

10. 22 CONTINUE 

11. 11 CONTINUE 


Table 3: Skeleton FORTRAN Code For LDL r 
(Assuming the matrix U is completely full) 

As an example, implementation of the LDL T algorithm, shown in Table 3, for a given, 
simple 3x3 stiffness matrix 


[K] 


2-10 
-1 2 -1 

0 -1 1 


will lead to the following factorized matrix 


( 22 ) 


6 





(24) 


(25) 


Storage Schemes for the Coefficient Stiffness Matrix 

Successful implementation of a sparse equation solution algorithm depends rather heavily 
on the reordering method used. While the Reversed Cuthill-McKee (RCM), or Gipspoole- 
Stockmyer (GS) ... reordering algorithms can be used effectively in conjunction with skyline or 
variable bandwidth equation solution algorithms [16-18], these reordering algorithms are not 
suitable for sparse equation solution algorithm. Designing efficient sparse-reordering algorithms 
is a big task itself, and is outside the scope of this paper. For complete treatments on this subject, 
the readers are strongly recommended to popular textbooks and articles in the literature [16, 19- 
20], In this section, it is assumed that the best available sparse-reordering algorithm, such as 
Modified Minimum Degree (MMD), or Nested Di-section (ND) [16], has already been applied 
to the original coefficient matrix [K]. To facilitate the discussions in this section, assuming the 
resulted matrix [K] (after using MMD, or ND algorithm) takes the following form 
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( 26 ) 


[*] 


0 . 

1 . 

0 . 

2 . 

0 . 

0 . 

3 . 

0 . 

66 . 

0 . 

4 . 

0 . 


88 . 

5 . 

0 . 



110 . 

7 . 




112 


For the data shown in Eq. (26), it can be easily shown that the factorized matrix [U] will have 
the following form: 


[tf] 


x 0 0 x 0 x 

x 0 0 x 0 

x 0 x 0 

x x F 
x x 

X 


(27) 


In Eq. (27), the symbols "X" and "F" represent the nonzero values after factorizadon. However, 
the symbol "F" also refers to "Fills-in" effect, since the original value of [K] at location has zero 
entry. 

For the same data shown in Eq. (26), if "skyline” equauon solution is adopted [21], then 
the ''fills-in" effect will take the following form: 




0 0 x 0 x 

r 0 F x F 

x F x F 

x x F 

X X 
X 


(28) 


On the other hand, if "variable-bandwidth" equation solution is adopted [22], then the "fills-in" 
effect (on the data shown in Eq. 26) will have the following form: 




x F F x F x 

x F F x F 

x F x F 

x x F 

X X 

X 


(29) 


Thus, for the data shown in Eq. (26), the "sparse" equation solution is the best (in the 
sense of minimizing the number of arithmetic operadons, and the required storage spaces in a 
sequential computer environment) and the "variable-bandwidth" equauon solution is the worst 
one! 
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For practical computer implementation, the original stiffness matrix data, such as the one 
shown in Eq. (26), can be represented by the "sparse formats" as following: 


ISTARTROW 


1 1 


' I 

2 


2 

3 


3 

4 

► = < 

4 

5 


5 

6 


6 

it 

+ 


1 


(30) 


ICOLNUM 


1 


i ' 

2 


2 

n 

J 


j 

4 


4 

5 


5 

6 -NCOEF 


6 


DIAG 


1 1 


11. ’ 

2 


44. 

-> 

> - i 

66. 

4 


88. 

5 


110. 

■6 =N 


112 


1 


AK { 



6 = NC OEF 



(31) 


(32) 


(33) 


The following definitions are used in Eqs. (30-33): 


• N: 

• NCOEF: 

• ISTATROW: 

• ICOLNUM(j): 


• DIAG(i): 

• AK(j): 


Size of the original stiffness matrix [K]. 

The Number of non-zero, off-diagonal terms of the original stiffness 
matrix 

Starting location of the first non-zero, off-diagonal term for the i :h row 
of [K] . The dimension for this integer array is N + 1 . 

Column numbers associated with each non-zero, off-diagonal terms of 
[K] ( in a row-by-row fashion). The dimension for this integer array 
is NCOEF. 

Numerical values of the diagonal term of [K] . The dimension for this 
real array is N. 

Numerical values of the non-zero, off-diagonal terms of [K] (in a row- 
by-row fashion). The dimension for this real array is NCOEF. 
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Sparse Symbolic Factorization 


The purpose of symbolic factorization is to find the locations of all nonzero (including 
"fills-in" terms), off-diagonal terms of the factorized matrix [XJ] (which has NOT been done vet!). 
Thus, one of the major goals in this phase is to predict the required computer memory for 
subsequent numerical factorization. The outputs from this symbolic factorization phase will be 
stored in the following 2 integer arrays (assuming the stiffness matrix data shown in Eq. 26 is 
used): 


JSTARTROW 


1 ’ 


1 

2 


*1 

3 


4 

4 

► = < 

5 

5 


7 

6 


8 

7=jV>] 


8 


(34) 


JCOLNUM 


1 


4 

2 


6 



5 

'■ 4 


5 

5 


5 

6 


6 

7 =NC OEF2 


6 


(35) 


The following "new" definitions are used in Eqs. (34-35): 


•NCOEF2: 

•JSTARTROW(i): 

•JCOLUMN0: 


The number of nonzero, off-diaaonal terms of the factorized matrix 

[U] 

Starting location of the first nonzero, off-diagonal term for the ith 
row of the factorized matrix [U]. The dimension for this integer 
array is N + 1 . 

Column numbers associated with each nonzero, off-diagonal terms 
of [U] (in a row-by-row fashion). The dimension for this integer 
array is NCOEF2. Due to "fills-in" effects, NCOEF2 > > NCOEF. 


The "key" steps involved during the symbolic phase will be described in the following 
paragraphs: 


Step 1 : Consider each ith row (of the original stiffness matrix [K].) 

Step 2 : Record the locations (such as column numbers) of the original non- zero, off- 
diagonal terms 
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Step 3: Record the locations of the "fills-in" terms due to the contributions of some (not 
all) appropriated, previous rows (where 1 <.ji i-I) Also consider if current ith 
row will have any immediate contribution to "future" row. 

Step 4 : Return to Step 1 for next row 

A simple, but highly inefficient way to accomplish Step 3 (of the symbolic phase) will be 
identifying the nonzero terms associated with the ith column. For example, there will be no 
"fills-in” terms on row 3 (using the data shown in Eq. 26), due to "no contributions" of the 
previous rows 1 and 2. This fact can be easily realized by observing that the associated 3rd 
column of [K] (shown in Eq. 26) has no nonzero terms. 

On the other hand, if one considers row 4 in the symbolic phase, then the associated 4th 
column will have 1 nonzero term (on row 1). Thus, only row 1 (but not rows 2 and 3) may have 
"fills-in" contribution to row 4. Furthermore, since K I 6 is nonzero (=2), it immediately implies 
that there will be a "fills-in” terms at location U 16 of row 4. 

A much more efficient way to accomplish step 3 of the symbolic phase is by creating 2 additional 
integer arrays, as defining in the following paragraphs 

ICHAINL(i): Chained list for the ith row. This array will be efficiently created to 
identify which previous rows will have contributions to current ith row. The 
dimension for this integer, temporary array is N. 

LOCUPDATE(i) :Updated starting location of the ith row. 

Using the data shown in Eq. (26), uses of the above 2 arrays in the symbolic phase can 
be described by the following step-by-step procedure: 

Step 0: Initialize arrays : 



I 


' i 

1 


2 


2 

ICHAINL ' 


’ = {0} and LOCUPDATE * 

- 


* i 

w 


N 


Step 1 : Consider row i = l 

Step 2 : Realizing that the original nonzero terms occur in columns 4 & 6 

Step 3 : Since the chained list ICHAINL(i=l) = 0, no other previous rows will have 

any contributions to row 1 


ICHAJNL(4) = 1 

( 36 ) 

ICHAJNL(l) = 1 

( 37 ) 


n 




LOCUPDATE(i = 1) = 1 

(38) 

Equations (36-37) indicate that "future" row i=4 will have to refer to row 1, and row 1 will refer 
to itself. Eq. (38) states that the updated starting location for row 1 is 1. 

Step 1: 

Consider row i = 2 


Step 2: 

Realizing the original nonzero term(s) only occurs in column 5 


Step 3: 

Since ICHAINL (i=2) = 0 no other previous rows will have any contributions 
to row 2 


ICHAINL(o) = 2 

(39) 


ICHAINL(2) = 2 

(40) 


LOCUPD ATE(i = 2) = 3 

(41) 

Equations (39^0) indicate that "future" row i=5 will have to refer to row 2, and row 2 
to itself. Eq. (41) states that the updated starting location for row 2 is 3 

will refer 

Step 1: 

Consider row i=3 


Step 2: 

The original nonzero term(s) occurs in column 5 


Step, 3: 

Since ICHAINL( i=2) = 0 no previous rows will have any contributions to 
row 3. 

The chained list for "future" row i=5 will have to be updated in order to include row 
list: 

ICHAINL(3) = 2 

3 into its 
(42) 


ICHAINL(2) = 2 

(43) 


LOCUPDATE(i = 3) = 4 

(44) 


Thus Eqs. (39,43,42) state that "future" row i = 5 will have to refer to rows 2, row 2 will refer 
to row 3, and row 3 will refer to row 2. Eq. (44) indicates that the updated starting location for 
row 3 is 4 

Step 1 : Consider row i=4 

Step 2 : The original nonzero term(s) occurs in col umn 5 
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Step 3: Since ICHAIN(i=4) = 1, and ICHAINL(l) = 1 (please refer to Eqs. 36-37), 
it implies row #4 will have contributions from row 1 only. The updated starting 
location of row 1 now will be increased by one, thus 

LOCUPDATE ( 1 ) = LOCUPDATE(l) + 1 (45) 


Hence, 

LOCUPDATE(l) = 1 + 1 =2 ( please refer to equation 38) (46) 

Since the updated location of nonzero term in row 1 is at location 2 (see Eq. 46), the 
column number associated with this nonzero term is column #6 (please refer to Eq. 31). Thus, 
it is obvious to see that there must be a " fills-in" term in column #6 of (current) row #4. Also, 
since K, 6 = 2. (or nonzero), it implies "future" row i=6 will have to refer to row 1. 
Furthermore, since the first nonzero term of row 4 occurs in column 5, it implies that "future" 
row 5 will also have to refer to row 4 (in additions to refer to rows 2 & 3). The chained list for 
"future" row 5, therefore, has to be slightly updated (so that row 4 will be included on the list) 
as following 


ICHAINL(3) = 2 

(47) 

ICHAINL(2) = 2 

(48) 

LOCUPDATE(i=3) = 4 

(49) 


Notice that Eq. (48) will override Eq. (43). Thus. Eqs. (39,48,47) clearly show that symbolically 
factorizing "future" row i=5 will have to refer to rows 2, then 4 and then 3, respectively. 


Step 1 : Consider row i=5 

Step 2 : The original nonzero term(s) occurs in column 6 

Step 3 : Since 

ICHAINL (1 = 5) =2 
ICHAINL (2) = 4 

ICHAINL (4) = 3 


(39, 

(48, 

(47, 


repeated) 

repeated) 

repeated) 


It implies rows #2, then 4, and then 3 "may" have contributions (or "fills-in” effects) on row 5. 
However, since Kj 6 is originally a nonzero term, therefore, row 2,4 and 3 will NOT have any 
"fills- in" effects on row 5. 

Step 1 : There is no need to consider the last row i=N=6, since there will be no "fills- 
in" effects on the last row 
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It is extremely important to emphasize that upon completion of the symbolic phase, the output 
array: JCOLUMN(-) has to be re-arranged to make sure that the column numbers in each row 
should be in the increasing orders! 

Sparse Numerical Factorization and Forward Backward Solutions 

It is generally safe to say that sparse numerical factorization is more complicated for 
computer coding implementation than its skyline, or variable bandwidth cases. Main difficulties 
are due to complex "book-keeping" (or index referring) process. The "key" ideas in this 
numerical phase are still basically involved the creation and usage of the 2 integer arravs 
ICHAINL(-) and LOCUPDATE(-), which have been discussed with great details in Section 5. 
There are two (2) important modifications that need to be done on the symbolic factorization, in 
order to do the sparse numerical factorization (to facilitate the discussion, please refer to the data 
shown in Eq. 26): 

a) For symbolic factorization purpose, there is no need to have any floating points, arithmetic 

calculation. Thus, upon completing the symbolic process for row 4, there are practically no 
needs to consider row 2 and/or row 3 for possible contributions to row 5. Only row 4 needs 
to be considered for possible contributions (or "fills-in" effects) to row 5 (since row 4, with 
its "fills-in", is already full). 

For numerical factorization purpose, however, all rows 2, then 4, and then 3 will have to be 
included in the numerical factorization of row 5. 

b) For sparse numerical factorization, the basic skeleton FORTRAN code for LDL T , shown in 
Table 3 of Section 5, can be used in conjunction with the chained list strategies (using arrays 
ICHAINL and LOCUPDATE) which have been discussed earlier in Section 5. The skeleton 
FORTRAN code for sparse is shown in Table 4. Comparing Table 4 and Table 3, one 
immediately sees the "major differences" only occur in the 2 do-loop indexes LDL T , on lines 
3 and 6, respectively. 

c) Since the sparse forward and backward equation solution phases require much less 
computational efforts (as compare to factorization phase), their discussions will be omitted in 
this work. 


1. c Assuming row 1 has been factorized earlier 

2. Do 11 I = 2, N 

3. Do 22 K= Only those previous rows which have contributions to 
current row 1 

4. c Compute the multiplier ( Note : U represents L T ) 

5. XMULT = U(K,I) / U(K,K) 

6. Do 33 J = appropriated column numbers of row it K 

7.1 U(I,J) = U(I,J) - XMULT * U(K.J) 

8. 33 CONTINUE 

9. U(K,I) = XMULT 


10. 22 CONTINUE 

11. 11 CONTINUE 


Table 4: Pseudo FORTRAN Skeleton Code For Sparse LDL T Factorization 
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Finding Master (or Super) Degree-of-Freedom (dof) 


To simplify the discussion, assuming that upon completion of the symbolic phase, the 
stiffness matrix [K] will have the following form 


m = 


X X 
X 


X X 
X X 
X X 
X 


( 50 ) 


In Eq. (50), the stiffness matrix [K] has 14 dof. The symbols "x" and "F" refer to the 
original nonzero terms, and the nonzero terms due to "fills-in", respectively. It can be 
seen that rows 1-3 have same nonzero patterns (by referring to the enclosed ''rectangular" 
region, and ignoring the fully populated "triangular'’ region of rows 1-3). Similarly, rows 
4-5 have same nonzero patterns. Rows 7-10 have same nonzero patterns. Finally, rows 
11-14 also have same nonzero patterns. Thus, for the data shown in Eq. (50), the 
"Master" (or "Super") dof can generated as 



( 51 ) 


According to Eq. (51), then the "master" (or "super"") dof are dof # 1 (which is followed 
by 2 "slave" dof), dof ft 4 (which is followed by 1 slave dof), dof # 6 (which has no slave 
dof!), dof ft 7 (which is followed by 3 slave dof), and dof it 11 (which is followed by 3 
slave dof). 

Sparse Matrix-Vector Multiplication (With Unrolling Strategies) 

In our developed sparse equation solver, upon ob taining the solutions, the user has 
the option to compute the relative error norm. For the error norm computation, one needs 
to have efficient sparse matrix (with unro lling strategies) vector multiplication. 
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Furthermore, efficient sparse matrix-vector multiplication's are also required in different 
steps of the subspace and Lanczos algorithms (see Section 4). 

To facilitate the discussions, let's consider the coefficient (stiffness) matrix as shown in 
Fig. 1 . This 14 dof matrix is symmetrical, and it has same nonzero patterns as the one 
considered earlier in Eq. (50). The master/slave dof for this matrix has been discussed 
and given in Eq. (51). The input data file associated with Fig. 1 follows exactly the same 
sparse numerical factorization procedures discussed earlier in Section 5 (see Eqs. 30-33). 
The sparse matrix-vector [A]*{x}, multiplication (with unrolling strategies) can be 
described by the following step-by-step procedures (please also refer to Fig. 1). 

Step 0.1 : Multiplication's Between the Given Diagonal Terms of [A] and vector {x}. 

Step 0.2 : Consider the first "master” dof. According to Fig. 1 (and Eq. 51), the 
first master dof is at row if 1 , and this master dof has 2 associated slave 
dof. In other words, the first 3 rows of Fig. 1 have the same off-diagonal, 
nonzero patterns. 

Step 1 : The first 3 rows (within a rectangular box) of given matrix [A] (shown in 

Fig. 1) operate on the given vector {x}. 

Step 2 : The first 3 columns (within a rectangular box) of the given matrix [A} 

(shown in Fig. 1) operate on the given vector {x}. 

Step 3 : The upper and lower triangular portions (right next to the first 3 diagonal 

terms of the given matrix [A] operate on the given vector {x} , according 
to the orders a (9., 9.), then b (1., 2.), and finally c (1., 2.) (as shown in 
Fig. 1) 

Step 4 : The row number corresponds to the next "master" dof can be easily 

computed (using the master/slave dof information, provided by Eq. 51). 

If the next "master" dof number exceeds N (where N = total number of dof of the given 
matrix [A], then stop, or else return to Step 0.2 (where the "first" master dof will be 
replaced by the "second" master dof etc.) 
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Third Step: 

The upper and lower triangular region will finally be processed 
(according 10 the order a, b, and c, respectively) 
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First Step 
These 3 rows 
will be processed 
( Dot Product Operations ) 


i 1 i 

Second Step : 

These 3 columns will be processed (SAXPY operations) 


Figure 1: Sparse Matrix-Vector Multiplication's With Unrolling Strategies 
Modifications For The Chained List Array ICHAINL(-) 

The Chained list strategies discussed earlier in Section 5 need to be modified in 
order to include the additional information provided by the MASTER dof (refer to, for 
example, Eq. 51). The major modification that need to be done can be accomplished by 
simply making sure that the chained list array ICHAINL(-) will be poi ntin g only toward 
the Master dof (and not toward the slave dof !) 


Sparse Numerical Factorization With Unrolling Strategies 

The Vector unrolling, and loop unrolling strategies that have been successfully 
introduced earlier by the authors for skyline [21] and variable bandwidth [22] equation 
solvers, can also be effectively incorporated into the developed sparse solver (in 
conjunction with the Master dof strategies). Referring to the stiffness matrix data shown 
in Eq. 50, for example, and assuming the first 10 rows of [U] have already been 
completely factorized, thus our objective now is to factorize the current ith (= list) row. 
By simply observing Eq. (50), one will immediately see that factorizing row #11 will 
required the information from the previously factorized row numbers 1,2, 3 ,6, 7,8,9, and 
10 (not necessarily to be in the stated increasing row numbers!)- in the "conventional" 
sparse algorithm. Using "loop-unrolling'’ sparse algorithm, however, the chained list 
array ICHAINL(-) will point only to the "master" dof # 6, # 7 and # 1. 

The skeleton FORTRAN code for LDL T (with sparse matrix) shown in Table 4 (refer to 
Section 5) should be modified as shown by the pseudo, skeleton FORTRAN code in Table 
5. Comparing Table 4 (sparse LDL T factorization) and Table 5 (sparse LDL T 
factorization, with unrolling strategies), one still can recognize the many similarities 
between the 2 sparse algorithms. 
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) 



— 

l.c .. 


Assuming row 1 has been factorized earlier 


2. 


Do 11 1 = 2, N 


3. 


Do 22 K=Only those previous "master" rows which have contributions to 





current row I 


4.1c 


.Compute the muitiplier(s) ( Note: U represents L T ) 


4.2 


NS LAVE DOF= MASTER (I) - 1 


5.1 


XMULT = U(K,I) / U(K.K) 


5.2 


XMUL ra = U(K+m),D/U(K+m.K+m) 


5.3c 


. m = l,2 ... SLAVE DOF 


6 


Do 33 J = appropriated column numbers of " master " row # K 



7.1 


U(U) = u (IJ) - XMULT * U(K,J) 


7.2 


- XMULm *U(K + m t J) 


8 

33 

CONTINUE 


9.1 


U(K.I) = XMULT 


9.2 


U(K-rmJ) = XMULm 


10. 

22 

CONTINUE 

- 

11 

11 

CONTINUE 


Table 5 : Pseudo FORTRAN Skeleton Code For Sparse LDL T Factorization With 
Unrolling Strategies 


6. NUMERICAL EVALUATIONS OF DIFFERENT GENERALIZED EIGEN- 

SOLVERS 

Based upon the discussions in previous sections, practical finite element models 
(such as Exxon-off-shore Structure, and High Speed Civil Transport Aircraft) are used to 
evaluate the performance of the developed sparse eigen-solvers. Since the codes have been 
written in standard FORTRAN language (and without using any library subroutines), it 
can be ported to different computer platforms (such as SUN-Sparc-20, EBM-R6000/590, 
Intel Paragon, Cray-C90 etc...) with no (or minimum) changes to the codes. The accuracy 
of the developed codes for solving generalized eigen equations can be measured by the 
Relative Error_Norm ( = R.E.N.) which can be computed as : 

R.E.N. = M ~ W<H 

ir<M 1 

The basic subspace iteration code, given in Ref. [1], will be used as a based-line reference. 
This basic subspace iteration code [1] will be compared to the developed basic, "sparse" 
Subspace iteration, and "sparse" Lanczos codes. 

Lumped masses have been used in all examples in this section. 

Example 1 : EXXON Off-Shore Structure 

The finite element model for the EXXON model has been used extensively in 
earlier research works [23-25]. The resulted system of generalized eigen-equations from 
the EXXON model has 23,155 dof. The number of nonzero terms of the original stiffness 




IS 


matrix is 809,427. Using the Nested-Dissection (ND) algorithm, the number of nonzero 
terms (including "fills-in" terms) is 10,826,014. The relative error norm (or R.E.N., 
defined in Eq. 52) and the wall-clock time are presented and explained in Figures 2-3. 

It should be noted here that on the IBM-R6000/590 Workstation, vector processing 
capability is available, where as the vector processing capability is "not" available on the 
Sun Sparc-20 workstation. 

Example 2 : High Speed Civil Transport (HSGT) Aircraft 

The finite element model for the HSGT aircraft has been used extensively in earlier 
research works. The resulted system of linear equations from the HSGT model has 16.152 
dof. The number of nonzero terms of the original stiffness matrix is 373,980. Using the 
Modified Minimum Degree (MMD) algorithm, the number of nonzero terms (including 
"fills-in" terms) is 2,746,286. The numerical performances of 3 generalized eigen-solvers 
are presented in Figures 4-5. 


Figure 2. Exxon model 
(Stretch, IBM-RS600/590 Workstation) 



No. Eigen-values 


Figure 3: Exxon Model 
{USTSU31, Sun-Soarc20 Womstauon) 
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figur* 4: HSCT Aircraft modal 
(S*wcft. IBW-R600I590 Wbrkuaon) 




Figure 5: Aircraft model 
{Rhioo. ODU Sun Sparc-20 Worfcsation } 
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7. CONCLUSION 

In this paper, basic generalized eigen-solution algorithms are reviewed. Major 
computational tasks in Subspace iterations, and Lanczos algorithms have been i dentifi ed . 
Efficient Sparse technologies have been developed, and fully utilized (such as: sparse 
symbolic, numerical factorization with unrolling strategies, sparse forward & backward 
solutions, sparse matrix-vector multiplication's etc...)., in conjunction with the basic 
Subspace Iterations and Lanczos algorithms for efficient solutions of the generalized eigen- 
equations. Numerical results from practical finite element models have clearly indirared 
that the proposed "sparse" Subspace iterations, and Lanczos algori thms have offered 
substantial computational advantages over the traditional "skyline", or "variable 
bandwidth" strategies. 
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PROGRAM S PARS EPACK9 7 

£* + + +★*+* + + ★**★*★★*★+ + + * + ★* + ****+**★*+**** + ★★*** + ★**★*★****★****★★** 

c 

c EQUATION- EIGEN SOLVER FOR SPARSE POSITIVE DEFINITE SYSTEMS 

c 

c H. Runesha : runesha@cee.odu.edu 

c Prof. Due Nguyen : nguyen@cee.odu.edu 

c 

c Last update: Dec 20 , 1997 

c - J'ai ajoute le shift et le consistant mass 

c October 26, 1998 

c > Because J. Qin tries to re-use memory, error messages and/or 

c wrong answers occur whenever ncoff is less than approx. 2*neq 

c Although this case rarely occurs in real, practical problems 

c (may only occur for "artificial, small scale" examples, however 

c it is annoying to have this "bugs". These bugs have been fixed 

c in this SPARSEPACK version of eigen- solution 

c 

c Two (2) small changes have been made in spamain.f, and spaldln.f 

c which involve mtota and if (ncoff . . . ) 

c 

c > For structures with rigid body motions (such as "floating" structures, 

c without any supports) , the "correct error norm" should be close 

c to 1.0 (instead of close to 0.0), because 

c error norm = | |K*Phi - Lamda*M*Phi| | / | | K* Phi j | 

c since Lamda = eigen-value = 0.0 (corespond to rigid body motion), 

c hence error norm = ||K*Phi - 0.0|| / ||K*Phi|| = close to 1.0 

c For "float structures" we need to apply "shift factor" to avoid 

c "singular" stiffness matrix 

c 

£+ + + + + *******★★★★*** + + **★** + ** + * + ★**■* + *** + + ****++★**•*■* + *******★***** 

c 

c NOTES : 

C 

c MREAD < 

c > 

C NREORD = 

c = 

C LUMP 

c = 

c NEIG 

c 

c N 

c NCOFF = 

c ISHIFT = 

C 

C I BLOCK = 

c = 

c = 

c ITIME = 

c = 

c 
c 
c 

£**** + + ** + *** + + *** + + * + + + ********★ + + ★*** + + * + *★**** + **★******** + ***** + 


0 Read K.* 

0 Read fort.* 

0 no reordering 
3 MMD reordering 

0 CONSISTANT MASS { or Lump.ne.O) 

1 LUMPED MASS 

number of desired eigenvalues and corresponding 

eigenvectors 

number of equations 

number of nonzeros coefficients 

0 no shift 

. NE . 0 perform a shift of value "ishift" 

-1 : Subspace iteration 
0 : Regular Lanczos 

1,2,3 : block Lanczos { of Block = iblock) 

0 save memory 

1 save time 


implicit real*8 (a-h,o-z) 
character*70 title 


c 

c 

c 


real*8 a(48 875 248) 
mtot = 48 875 220 

real*8 a(28 875 248) 
mtot = 28 875 220 

real*8 a(20 000 000) 
mtot = 20 000 000 

real*8 a(9 000 000) 
mtot = 9 000 000 


open (unit = 7, f ile= * K. INFO' , status=' old' , form= ' formatted' ) 
read (7,115) title 
FORMAT (A60) 

read ( 7 , * ) nreord , neig , lump , n , n2 , NCOFF , it ime , ishift , iblock , mread 
write (*, *) nreord, neig, lump, n, n2 , NCOFF, itime, ishift , iblock, mread 
close (7) 


1 3 



ooooo noon 


write (23 , *) 
write (23 , * ) 

title 



write (23 , * ) 

' NEQ 

= ' 

t n 

write (23 , * ) 

' NCOEF 

= ' 

, ncoff 

write (23 , * ) 

' NEIG 

_ t 

, neig 

write { 23 , + ) 

' ISHIFT 

= * 

, ishift 

write (23 , * ) 

' MREAD 

- ' 

, mread 

write ( 23 , * ) 

' LUMP 

= ' 

, lump 

write (23 , *) 

' IBLOCK 

= * 

, iblock 

write (23 , *) 

1 NREORD 

= * 

, nreord 

write ( 23 , * ) 
write (23 , * ) 

' ITIME 

_ i 

, itime 


c 

c if (iblock . eq. 0 . and. lump . ne . 1) then 

c write (23 ,*)' Sorry I The old Lanczos path can not deal with 

c 1 Consistant Mass ! f 
c stop 

c endif 

if { iblock . gt . 3 ) then 

write (23 ,*)' Sorry ! BLOCK size must less than 4 ! ' 

stop 
endif 

c if (iblock. eq. 0 .and. ishift .ne . 0) then 

c write ( 23 , *)' Sorry ! No shift is allowed for the Old Lanczos !' 

c stop 

c endif 

c 

c Read Data 

c 

if (mread. le . 0) then 

CALL QREAD (n, ncof f , mtot , a , mread, neig, lump, ishif t) 

CALL OOCSPA {n, mtot/ 2 , ncof f , a { 1 ) ,a(mtot/2+4) , mread, nllOO) 
endif 

REORDERING 

nreord . NE . 0 means MMD reordering is desired *** 

mtota = max { 4*n , 2*ncof f ) 
mtoti = 3*ncoff + 7+n + 5 
if (mtot-mtoti-mtota . It . 0) then 
write (23 ,*)' SPAOOC : increase mtot to: ' , mtoti+mtota 

stop 
endif 

call cputime (time 0) 

if (nreord . ne . 0) then 
IF (LUMP . EQ . 1) THEN 

call reord (n, ncof f , nreord, mtota, mtoti , a (1) ,a(l+mtota) , neig) 

ELSE 

if ( itime . eq . 0 ) then 

call reord2 (n, ncof f , nreord, mtota , mtoti , a ( 1) , a (1+mtota) , neig) 
else 

call reord3 (n, ncof f , nreord, mtota , mtoti , a (1) , a (l+4*ncof f ) , neig) 
endif 
ENDIF 
endif 


SOLVER 

call sparse (n, ncof f , neig, mtot /2 , mtot/ 2 , a (1) , a (l+mtot/2) ) 

using 4 bytes for integer .... June 7, 1995 

mtot i=mtot/4+ 1000 
mtota=mtot -mtoti 

call sparse (n, ncof f , neig, lump, iblock, mtoti, mtota, a (1) , a (1+mtoti) , 
1 ishift) 

call cputime (timel) 
timel=timel - timeO 
if (neig . ne . 0 ) then 

write (23 ,*)' ***T0TAL CPU FOR EIGENSOLUTION = timel 

write (23 ,*)'*** (This time including norm check & I/O )***' 

write (23, * ) ' MTOTI = mtoti,' MTOTA = mtota 

endif 

stop 

end 


c 


24 



u u 




subroutine oocspa (n, mtota, ncof f , a, iq, mread, nllOO) 

implicit real*8 (a-h,o-z) 

real*8 a(l) 

integer iq ( 1 ) 

if (mread . le . 0 ) then 

rewind (15) 

read (15) (iq (i) , i=n+2 , l+2*n) 
iq (1) =1 
do 10 i=2,n+l 

10 iq (i) =iq (i-1) +iq(i+n) 

rewind (15) 

write (15) ( iq ( i ) ,i=l,n+l) 
else 

if (n . gt . 0 ) stop 
endif 
return 
end 
c 

c 


SUBROUTINE QREAD (neq, ncoef f # mtot , a, mread, neig, lump, ishif t) 
real*8 a(l) 

call iread {neq, ncoef f , mtot , a , mread, lump) 
c dec 20 1997 : Runesha 

c if ( lump . ne . 1 . or . ishif t . ne . 0 ) then 

c call rreadpierrot (neq, ncoef f, neig, lump, ishift,a(l) ,a(n+l) , 

c + a (ncoef f+n+1) , a (ncoef f +2 +n+l ) , a (ncoef f +3*n+l) ) 

c else 

call rread (neq, ncoef f , mtot , a, mread, neig, lump) 
c endif 

return 
end 


c ^ +******★*★★*★**★ 

c 


SUBROUTINE RREAD (neq, ncoef f , mtot , a, mread, neig, lump) 
real*8 a(l) 
if (mread . It . 0) then 

OPEN (unit=54 , f ile= ' Kll . COEFS ' , form*' formatted' , status* ' old' ) 
OPEN (unit* 55 , file*' K . DIAG' , form* ' formatted' , status* ' old' ) 
OPEN (unit=56 , file='K.RHS' , form*' formatted' , status* 'old' ) 
OPEN (unit=57 , f i 1 e * ' K . DMAS S ' , form*' formatted' , status* ' old' ) 
if (neig .ne . 0 . and. lump . ne . 1) OPEN (unit* 5 8 , file*' K.CMASS ' , 

If orm* ' formatted' , status* ' old' ) 
else 

OPEN (unit*55 , file*' K. DIAG. B # , 

3 form* ' unformatted' , status* 'old' ) 

OPEN (unit* 54 , f ile* ' Kll . COEFS . B ' , 

4f orm* ' unformatted' , status* ' old' ) 

OPEN (unit = 56 , f ile* ' K. RHS . 3 ' , 

5f orm* ' unformatted' , status* ' old' ) 

OPEN (unit = 57 , f ile* ' K . DMASS . B ' , 

5 form* ' unformatted' , status* ' old' ) 
if (neig . ne . 0 . and. lump .ne . 1) OPEN (unit = 5 8 , file*' K.CMASS . B' , 

If orm* ' unformatted' , status* ' old' ) 
endif 

jfile = 12 

ISEG = ( ncoef f - 1 ) / mtot + 1 
DO 100 I = l,iseg 
istart = (i-1) *mtot + 1 
iend * min (ncoef f , i*mtot) 
length = iend - istart + 1 
if (mread . It . 0 ) then 

read (42+ j file , *) (a(k) ,k=l, length) 

c read (42+ jfile, 9901) (a(k) ,k=l, length) 

9901 format (6E12 . 5) 
else 

read (42+j file) (a (JO ,k=l, length) 
endif 

rewind ( j file) 

write (jfile) (a(k) ,k=l, length) 

100 continue 

if (mread . It . 0) then 

read ( 55 , * ) (a ( i) , i=l , neq) 





c read (55,9901) (a (i) , i = l,neq) 

else 

read(55) (a (i) , i = l , neq) 
endif 

rewind (13 ) 

write{13) (a(i) , i=l,neq) 

c write (*,*) 'Inside rread AD=' , (a (i) , i = i , neq) 

if (mread. It . 0) then 

read(56,*) (a ( i) , i = l , neq) 
c read ( 56 , 9901) (a (i) , i = l , neq) 

else 

read(56) (a ( i) , i=l , neq) 
endif 
rewind (14 ) 

write (14) (a(i) ,i=l,neq) 
if (neig . eq . 0 ) return 
if (mread. It . 0) then 

read(57,*) (a ( i) , i = l , neq) 
c read ( 57 , 9901) (a (i) , i=l , neq) 

else 

read(57) (a (i) , i = l , neq) 
endif 
rewind (10) 

write (10) (a (i) , i=l , neq) 
if (neig . ne . 0 . and . lump . ne . 1) then 
if (mread. It . 0 ) then 

read (58,*) (a (i) , i=l , ncoef f ) 

else 

read(58) (a (i) , i=l, ncoef f ) 
endif 
rewind ( 17 ) 

write (17) (a ( i) , i=l , ncoef f) 
endif 
return 
end 
c 

£★**★★**+ + **★★ + +**++ + + + + ★★★*★★*★*★★*★*★+++ + + +*+**★★★★★★* + *★****★★★**★ + ** + 

c 

SUBROUTINE IREAD (neq, ncoef f , mtot , ia, mread, lump) 
integer ia(l) 
if (mread . 1 t . 0 ) then 

OPEN (unit = 52 , f ile= ' K. PTRS ' , form= ' formatted' , status= ' old' ) 

OPEN (unit = 5 3 , f ile= ' Kll . INDXS ' , form=' formatted' , status=' old' ) 
else 

OPEN (unit =52, f ile= ' K . PTRS . B ' , form= 

1 ' unformatted' , status*' old' ) 

OPEN (unit = 53, file=' Kll. INDXS .B' , 

2 f orm= ' unformatted' , status= 'old' ) 
endif 

jfile = 11 

IS EG = ( ncoef f - 1 ) / mtot + 1 

DO 100 I = l,iseg 
istart = (i-l)*mtot + 1 
iend = min (ncoef f , i*mtot) 
length = iend - istart + 1 
if (mread . It . 0) then 

read (42 +j file,*) (ia(k) / k=l, length) 
else 

read { 42+ j file) (ia(k) ,k=l, length) 
endif 

rewind ( j file) 

write (jfile) (ia (k) , k=l, length) 

100 continue 

if (mread . It . 0 ) then 

read ( 52 , * ) (ia(k) ,k=l,neq) 
else 

read (52 ) (ia(k) , i=k,neq) 
endif 
rewind ( 15 ) 

write(15) ( ia ( k) ,k=l,neq) 
return 
end 
c 

c -** + + *★★★★★★★★ + ★★★ + + ★ + + + + ★★★★★★★★★★ + ★★* + **★★** + ★★★★*+*** + ★***★* + ********* 

c 

subroutine rreadpierrot (n, ncoef , neig, lump, ishif t , ad, an, b, dm, am) 
implicit real*8 (a-h,o-z) 
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dimension ad(*), b{*), an(*}, dm(*),am(*) 


OPEN (unit=54 , file='Kll. COEFS ' , form=' formatted' , status='old' ) 
OPEN (unit = 5 5 , f ile= r K. DIAG' , form= ' formatted' , status= ' old' ) 
OPEN (unit=56 , file- ' K . RHS ' , f orm= ' formatted' , status= ' old' ) 
c OPEN (unit = 57 , f ile= ' K. DMASS ' , form= ' formatted' , status = 'old' ) 

c if (neig . ne . 0 . and. lump . ne . 1) OPEN (unit=58 / file=' K. CMASS ' , 

c If orm= ' formatted' , status^ ' old' ) 

c 

c perform also a shift on the diag values 

c 

write ( * , * ) ' ISHIFT = ' , ishift 

read (54, *) (an(i) , i=l f ncoef ) 
read (55, *) (ad ( i) , i=l , n) 

write(*,*) ' AD before = ', (ad (i) , i=l , n) 
read(56,*) (b(i),i=l,n) 

c read(57,*) (dm ( i ) , i=l , n) 

close ( 54 ) 
close { 55 ) 
close (56) 
c close(57) 

IF(NEIG.NE.O) THEN 

OPEN (unit=57 , f ile= ' K . DMASS' , form=' formatted' , status= ' old' ) 
read (57,*) (dm (i) # i=l , n) 
close (57) 
rewind ( 10 ) 

write(lO) (dm ( i ) , i=l , n) 
if (ishift . ne . 0 ) then 
do i=l,n 

ad (i) =ad (i) + f loat (ishift) *dm (i) 

write (*, *) dm(i) # float (ishift) *dm(i) , ad(i) 

enddo 

endif 

if (lump . ne . 1 ) then 

OPEN (unit = 58 , f ile=' K. CMASS ' , form= ' formatted' , status®' old' ) 
read(58,*) (am ( i) , i=l , ncoef ) 
close ( 58 ) 
rewind ( 17 ) 

write(17) (am ( i ), i=l , ncoef ) 
if (ishift . ne . 0) then 
do i=l, ncoef 

an (i) =an (i) +f loat (ishift) *am (i) 
enddo 
endif 
endif 
ENDIF 
c 

c write on fort.* 

c 

rewind (13 ) 

write (13) (ad ( i) , i = l , n) 

write (*,*) ' Inside rread AD =' , (ad ( i) , i=l , n) 

rewind ( 12 ) 

write(l2) (an (i) , i=l , ncoef ) 
rewind ( 14 ) 

write(14) (b(i),i=l,n) 

return 

end 




c 

subroutine cputime (time) 
real tar (2) 
real*8 time 
time=etime (tar) 
c time=TSECND ( ) 

c time=0 . 01*mclock ( ) 

return 
end 


c 

c 




subroutine sparse [n, ncoff ,neig, lump, iblock, mtoti , mtota, iq, a, 
1 ishift) 

implicit real*8 (a-h,o-z) 
real*8 a(l) 
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non 


integer iq(i) 
ierr=0 

c purpose: sparse gauss version equation solver 

c for system of symmetrical equations. 

c J. Qin Dec .28, 1992 

c H. Runesha June 1996 

£+++******** ****************************+**+**+*+++**++*++ ********** 
C*** UDU factorization ******* 

mtoti=mtoti*2 
do 81 i=l , ncof f +n+n 
81 a ( i) =0 . OdO 


call cputime (timer 0) 
rewind (15 ) 

read ( 15 ) ( iq (i) , i = l , n+1) 
rewind (11) 

read(ll) ( iq (i) , i=n+2 , n+l+ncof f ) 
rewind (13 ) 

read{13) (a(i) ,i=l,n) 
rewind (12) 

read (12) (a ( i) , i=2 *n+l , 2*n+ncof f ) 
rewind (14 ) 

read(14) (a ( i) , i=l+n, 2*n) 
call cputime (timerl) 
call cputime (timeOO ) 
iqmax = mtot i -ncof f -6*n- 3 

call symfac (n, iq (1) , iq(n+2) , iq (n+2+ncof f ) , iq (ncof f +2*n+3 ) , 

1 a (2*n+ncof f +1) , ncof 2 , iqmax) 

call cputime ( timed} 

write (23 , *) ' neq = ' ,n 

write (23 f *) ' before fill in, ncoff = ',ncoff 

write (23 , *)' after fill in, ncof2 = ',ncof2 

write (23 , *) ' Total integer memory used = ncoff +ncof2+ 

1 6*n+3 

write (23 , *)' Total real memory used = ' , ncof f +ncof 2+4*n 

if (mtoti.lt. (ncof f +3+6*n+ncof 2 ) ) then 

write (23,*) 'The integer array iq should > ' , ncof f +3+6*n 

1 +ncof2 

ierr=l 
endif 

if (mtot a . It . (4*n+ncof f +ncof 2 ) ) then 

write (23,*) 'The real array a should > ' , 4*n+ncof f +ncof 2 

ierr=2 
endif 

if ( ierr . ne . 0 ) stop 
call cputime (time02) 

call transa (n, n, iq (n+2+ncof f ) , iq (ncof f +2*n+3 ) , a (ncof f +2*n+l) , 

1 a (ncof f +3*n+2) ) 

call cputime (time03) 

call supnode (n, iq (n+2+ncof f ) , iq (ncof f +2*n+3) , iq (ncof f +4*n+3 
1 +ncof2) ) 

*** if neig . ne . 0 . and . ishif t . ne . 0 ) do shifting: K~=K+ishif t *M 
cpier 

c If ( iblock . gt . 0) then 

IF (NEIG. NE. 0 .AND. ISHIFT.NE. 0) THEN 
rewind ( 10 ) 

read (10) (a (2*n+ncof f +i) , i=l,n) ! read dmass 

if (lump . ne . 1) then 
rewind ( 17 ) 

read(17) (a (3*n+ncof f +i) , i=l, ncof f ) ! read am for consist, mass 

endi f 
C 

do i = 1 , n 

a ( i) = a ( i) + f loat (ishift ) * a (2*n+ncof f + i) l for diagonals 

enddo 

rewind (13 ) 

write(13) (a(i),i=l,n) 

if (lump . ne . 1) then 
do i = 1, ncoff 

a (2*n+i) =a (2*n+i) + float (ishif t) *a (3*n+ncoff+i) 
enddo 
rewind (12) 

write (12) (a(2*n+i),i = l, ncoff) 

endif 
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END IF 


c endif 

cpier 

call cputime (time031) 

c subroutine Numf al (n, ia f ja, ad, an, iu, ju, di, un # ip, iup, tl , isupd 

call numf aq (n, iq (1) , iq (n+2) , a (1) , a (2*n+l) , iq (n+2+ncof f ) , 

1 iq (ncof f +2*n+3 ) , a (2*n+ncof f +1) , a {ncof f + 3 *n+l) , 

1 iq(ncof f +2*n+3+ncof2) , iq (ncof f +3 *n+3+ncof 2) , 

1 a (3*n+ncof f +ncof 2 + 1) , iq (ncof f +4*n+3+ncof 2 ) 

1 , iopf ) 

call cputime (time04 ) 

IF(NEIG.NE.O) THEN 
if ( iblock . eq . - 1 ) then 
write (23,*) 'IBLOCK =', iblock 

c ********** ** SUBS PACE iteration*** 

c DM 

rewind ( 10 ) 

read(10) (a (ncof 2+ncof f +3*n+i) , i=l , n) ! read dmass 

i00=ncof 2+ncof f +3*n+l 

i01=i00+n 

ncmass=l 

if (lump . ne . 1) then 
ncmass=ncof f 
c AM 

rewind ( 17 ) 

read(17) (a ( i01-l + i) , i = l , ncof f ) 1 read am for consist, mass 

endif 

iiql=min (neig+8 , 2*neig) 
iiq=iiq+l 
iiq=min ( iiql , n) 
c iiq=n 

mtot=mtoti+mtota 

i02 = i01-*-ncmass 

i03=i02+iiq*iiq 

i04 = i03 4*iiq*iiq 

i05=i04+iiq*iiq 

i06=i05+iiq 

i07=i06+n 

i08=i07+n 

i09=i08+n 

il0=i09+n*iiq 

if (ilO.gt .mtota) write(*,*) ' NOT ENOUGH MEMORY ' 

call spasubspace ( iiq, n, ncof f , neig, lump,mtot, iq(l) , iq(n+2) 

+ , a ( 1) , a ( 2 *n+l ) ,a(i00) ,a(i01) , iq (n+2+ncof f ) , iq (ncof f +2*n+3 ) 

+ , a (2*n+ncof f + 1) , a (ncof f +3*n+l) ,a(i02) ,a(i03) ,a(i04) 

+ ,a(i05),a(i06),a(i07),a(iQ8),a(i09), ishift) 

return 


endif 

c ****** end subspace iter.******** 


if (iblock. ge . 1 ) then 

if ( iblock . gt . 6 ) iblock = 6 


C 

0 *********** Block Lanczos eigensolver: 

C** Iq() : l->n+l =ia; n+2-> ja; n+2+ncof f -> iu; 2*n+3+ncof f - > ju; 
C** total available = mtoti- (2*n+5+ncof f +ncof 2) 

ileft = (mtoti- ( 2 *n+ 6 +ncof f +ncof 2 ) ) 


c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 


isize = min ( (neig* 5 *iblock) / iblock, (n/iblock) *iblock) 
isize= ( isize/ iblock) *iblock 

ineed = isize*isize *6 + iblock*iblock *6 + isize *2 
write (*,*)' ileft , ineed = ', ileft , ineed 
if ( ineed. It . ileft ) then 


+ 1 


100 

101 

102 

103 

104 

105 

106 
else 

write (23 
stop 


2 *n+ 6 +ncof f +ncof 2 

i 00 +isize*isize *2 

i 01 +isize*isize *2 

i 02 +isize*isize *2 

i 02 +iblock*iblock *2 

i 02 +iblock*iblock *2 

i05+iblock* iblock* 2 


*) 'Increase mtoti to 


T (isize, isize) 

B (isize, isize) 
vec (isize, isize) 
alfa (iblock, iblock) 
beta (iblock, iblock) 
betai (iblock, iblock) 
eig (isize) 


, ineed+2*n+5+ncof f +ncof 2 
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c 

C** 


c ★ * * 


endif 

a ( ) : I->n = ad; n->2*n = RHS ; 2*n+l-> = an; di; 
j 02 = 2*n+ncoff + 1 ! 

if (lump . ne . 1) then 

consistant Mass matrix, using ad, an space for: dmass,am 


un ; 1 1 ; 

di, unchanged 


1 

j 00 + n 

3*n+ncof f +ncof 2+1 
j 03 + n*iblock 
j 04 + n*iblock 
j 05 + n*iblock 


dmass 
am 

r (n, iblock) 
p (n, iblock) 
tern (n, iblock) 
Q 


Q* * 


joo 

jOl 
j 03 = 
j 04 = 
j 05 = 
j 06 = 
else 

lumped mass matrix: 

ibmax = (2*n+ ncoff)/(3*n) 

write (*,*}' ibmax, iblocki , isize = ' , ibmax, iblock, isize 
if (ibmax . ge . iblock) then 
jOO = 3*n+ncof f +ncof 2+1 
j 01 = j 0 0 + n 


3 03 
j 04 
j 05 
j 06 
else 

joo 
jOl 
j 03 
j 04 
j 05 
j 06 


1 
j 03 
j 04 
j 01 


n*iblock 
n* iblock 
1 


1 

1+n 

2+n 

3 *n+ncof f +ncof 2+1 
j 04 + n+iblock 
j 05 + n*iblock 


dmass 

am 

r (n, iblock) 
p (n, iblock) 
tern (n, iblock) 
Q 

dmass 

am 

r (n, iblock) 
p (n, iblock) 
tern (n, iblock) 
Q 


write (*,*)' 2*n+ncof f , j 03 +n* iblock = ' , 2+n+ncof f , j 03 +n* iblock 
endif 


T { isize , isize) 

B { isize, isize) 
vec (isize, isize) 
alf a ( iblock, iblock) 
beta ( iblock, iblock) 
betai (iblock, iblock) 
eig ( isize) 

' , i06+isize 


Q+ + * * 

c 


endif 

100 = j 06 + n* (isize + iblock) 

101 = i00+isize*isize 

102 = i01+isize*isize 

103 = i02+isize*isize 

104 = i02+iblock*iblock 

105 = i02+iblock*iblock 

106 = i05+iblock*iblock 
if(i06+isize.gt. mtota) then 

write (23 , *) r Increase mtota to 
stop 
endif 

write (*,*) 'before call blanmain: lump = r , lump 

call blanmaintn, isize, iblock, neig, a { j 02+n) ,a(j02) , iq(n+2+ncoff ) , 

1 iq (2*n+3+ncof f ) , a { j 03 ) ,a(j04) ,a(j05) ,a(j00) , 

2 a(i04) ( a ( i05 ) ,a(i06) ,a(i02) ,a(i03) ,a(j06) , 

3 a (iOO) , a (iOl) , a ( jOl) , iq(l) , iq (2+n) , lump, ncof f , ishif t) 
return 

endif 

C ################################################################^##### 
C 

q***^** f or regular Lanczos : 
ax = neig**2 
bx = 3*neig 
cx = n - ncoff 

if (bx*bx-4 . 0*cx*ax . le . 0 . 0) then 
ixl = 1 
go to 1995 
endif 

xl = ( -bx+sqrt (bx*bx- 4 . 0*cx*ax) ) *0 . 5/ax 

ixl=xl 

1995 lanmax = min (8*neig, neig^ixl) 

write (23 ,*) ' LANMAX = ' , lanmax, xl*neig 
c lanmax = 4*neig 

c item = neig* (12 + 16*neig) + n 

c if (item. gt . ncof f) then 

c lanmax = 3*neig 

item = neig* (9 + 9*neig) + n 
c endif 

i j 0 = 2+4*n+ncof f +ncof 2 
if (ij 0 + lanmax**2 .gt .mtota) then 
write (23 ,*)' Increase Memory (real) to 
stop 
endif 


' , i j 0+lanmax**2 


c 

c 


if (lanmax . It . 3*neig) then 

Pierrot , je fermer ca mais il faut le verifier 

write ( 23 ,*)' Increase memory for: VEC (lanmax, lanmax) &EIG, ERR, TEM' , 


3o 



c 1 item, 'memory available now is: ',ncoff 
lanmax=4 *neig 

if (lanmax . gt .n+2 ) lanmax=n+l 
j 00=l+4*n+ncof f +l+ncof 2 
i j 0 = j 00+l+3*lanmax+lanmax**2 

if (j 00+lanmax* 3 +lanmax**2+n* lanmax .gt .mtota) then 
write (23 ,*)' Increase memory mtota to : ' , j 00+3*lanmax+ 

1 lanmax* *2 
stop 
endif 

write (23 , * ) ' lanmax, j 00 , i j 0 , ncof f , ncof 2= ' , lanmax, j 00 , i j 0 , ncof f , 
Incof 2 

jOl = 2+n 

if (ncof f . It . 2*n) then 

j 01 = 2*n+3+ncof f +ncof 2 + 1 
endif 

if ( lump . eq . 1 ) then 

CALL SP2LAN (n, lanmax, neig, a (ncof f +3*n+l) , a (2*n+ncof f +1) , iq (n+2+ 
Incof f) , iq (ncof f +2*n+3 ) , a ( 1) , a ( 1+n) ,a(3 *n+l+ncof f +1+ 

2ncof2) ,a(l+2*n) ,a(j00 + l) , a ( j 00+1+lanmax) , 

3 a ( 1 + j 00 + 2* lanmax) , a ( 1+ j 0 0 + 3* lanmax) , ncof f , ncof 2 , iq ( j 01) ,a(ij0) 

+ , lump , ishif t ) 
else 

jOl = 2 *n+3+ncof f +ncof 2 + 1 

c pierrot a ajouter ceci pour introduire ia ja et am 

j 01 = 2*n+3+ncof f +ncof 2 + 1 

if ( j 00+lanmax*3+lanmax**2+n*lanmax+ncof f .gt .mtota) then 
write (23 ,*)' Increase memory mtota to : 7 , j 00+3*lanmax+ 

1 lanmax** 2 +ncof f 
stop 
endif 

CALL SP2LAN2 (n, lanmax, neig, a (ncof f + 3*n+l ) , a ( 2*n+ncof f +1) , iq (n+2 + 
lncoff) , iq (ncof f +2*n+3) ,a(l) ,a(l+n) , a ( 3 *n+l+ncof f +1+ 

2ncof 2) ,a(l + 2*n) , a ( j 00 + 1) , a < j 00 + 1+lanmax) , 

3a ( 1 + j 0 0 + 2 * lanmax) , a (1 + j 00 + 3* lanmax) , ncof f , ncof 2 , iq ( jOl) , a (i j 0) 

+ , a ( i j 0+lanmax*n) , lump, ishif t, iq (1) , iq (n+2) j 

endif 

return 
endif 
jOl = 2+n 

if (ncof f . It . n) then 

jOl = 2 *n+3+ncof f +ncof 2 + 1 
endif 

if ( lump . eq . 1) then 

CALL SP2LAN (n, lanmax, neig, a (ncof f+3*n+l) , a (2*n+ncof f +1) ,iq(n+2+ 
lncoff) , iq (ncof f +2*n+3 ) ,a(l) ,a(l+n) , a { 3 *n+l+ncof f +1+ 

2ncof 2 ) ,a(l+2*n) ,a(l+3*n) , a ( 1 + 3 *n+lanmax) , 

3a ( 1 + 3 *n+2* lanmax) , a (1+3 *n+3* lanmax) , ncof f , ncof 2 , iq( j 01) , a (ij 0) 

+ , lump , ishif t) 

else 

c pierrot a ajouter ceci pour introduire ia ja et am 

j 0 1 = 2*n+3+ncof f +ncof 2 + 1 

if ( j 00 + lanmax* 3 + lanmax** 2+n* lanmax+ncof f . gt .mtota) then 
write (23 , *)' Increase memory mtota to : ' , j 00+3*lanmax+ 

1 lanmax* *2 +ncoff 
stop 
endif 

CALL SP2LAN2 (n, lanmax, neig , a (ncof f+3*n+l ) , a (2*n+ncof f +1) ,iq(n+2+ 
lncoff) , iq (ncof f+2*n+3) ,a(l) ,a(l+n) , a ( 3 *n+l+ncof f +1+ 

2ncof 2 ) , a ( 1 + 2 *n) ,a(l + 3*n) , a ( 1 + 3 *n+lanmax) , 

3a ( 1 + 3 *n+2* lanmax) , a ( 1 + 3 *n+3* lanmax) , ncoff , ncof 2 , iq ( jOl) , a (ijO) 

+ , a (ij 0+lanmax*n) , lump, ishif t , iq (1 ), iq (n+2) ) 

endif 


return 

ELSE 

call fbe (n, iq (n+2+ncof f ) , iq (ncof f +2*n+3 ) , a (2*n+ncof f +1) , 

1 a (ncof f +3*n+l) , a (n+1) , a (3*n+ncof f +ncof 2+1) , iopfb) 

call cputime ( time0 5) 

call multspa (n, iq ( 1) ,iq(n+2) ,a(2*n+l) ,a(l) , a (3*n+ncof f + 

1 ncof2 + l) , a (2*n+ncof f + 1) ) 
call cputime (time06 ) 
sum=0 . OdO 
sum2=0 . OdO 
j l=3*n+ncof f +ncof 2 
j 2 = 2*n+ncoff 


SI 



dmax = 0 . OdO 
bnorm = 0 . OdO 
axbn = 0 . OdO 
do 2000 i = l,n 
j = j 1 + i 
jj = j2 + i 

bnorm = bnorm + a (n+i) *a (n+i) 
axbn = axbn + (a ( j j ) -a (n+i) ) * + 2 
if (abs (a ( j ) } .gt .dmax) then 
dmax=abs (a ( j ) ) 
imax=i 
endif 

sum=sum+abs {a { j ) ) 
sum2 =sum2 + ( a ( j ) ) 

2000 continue 

axbn = sqrt (axbn) 
bnorm = axbn/sqrt (bnorm) 

write (23,+) 'The Max. Displacement = ' , a ( j 1 + imax) , ' at the 
1 ',imax,' -th D.O.F.' 

write (23,*) 'The ABS sum. of the displacements= ' , sum 
write (23, * ) 'The sum. of the displacements =',sum2 
write (23 ,*)' The | Ax - b I | =',axbn 

write (2 3, *) 'The | Ax - b | / ||b|| =', bnorm 

write (23 ,+)' Time for error norm check =' , time06- time05 

write (23 ,*)' Time for reading files = ' , timerl- timerO 

write (23 ,*)' Time for symbolic factorization = ' , timeOl- timeOO 
write (23 ,*)' Time for reordering = ' , time03 - time02 

write (23 ,*)' Time for Finding out supernodes = ' , time031- time03 
write (23 ,*)' Time for numeric factorization = ' , time04- time031 
write (23 ,+)' Time for forward/backward solve = ' , time05-time04 
write (23 , + ) ' Total time = ' , timeOl- timeOO 

1 +time05- time02 

write ( 23 ,+)' Total Operations in Factorization: iopf 
write (23 ,*)* Total Operations in Forw/Back. : ' , iopfb 
c write (23 , *)' Mf lops for factorization , float (iopf ) * 

c 1 0 . 000001/ (time04-time031) 

loop = 8 

write (23,*) 'LOOP UNROLLING LEVEL =',LOOP 

c write (23 ,*)' Mflops for forward/backward float (iopfb) + 

c 1 0 . 000001/ (time05-time04 ) 

ENDIF 

return 

end 

subroutine symfac (n, ia, ja, iu, ju, ip, ncof2 , iqmax) 
implicit real*8 (a-h,o-z) 
integer ia (1) , ja (1) ,iu(l) ,ju(l) , ip (1) 
nm = n - 1 
nh = n + 1 
do 10 i = 1, n 
iu ( i ) = 0 
10 ip(i) = 0 

jp = 1 

do 90 i = l,nm 
jpi = DP 

if ( jpi . gt . iqmax) then 

write (23 ,*)' Symbolic : increase ju() bigger than: ',jpi, 

1 '** now we only have iqmax = iqmax 

stop 
endif 

jpp = n + jp - i 
min = nh 
iaa = ia(i) 
iab = ia(i+l) - 1 
if ( iab. It. iaa ) go to 30 
j 0 = jpi- iaa 
CDIR$ IVDEP 


— 


do 20 j = iaa, 
jj = ja(j) 
ju(j0 + j) = jj 

iab 



20 

iu ( j j ) * i 
jp= jp+ ( iab-iaa) 

+ 1 



30 

min=ja (iaa) 
last = ip(i) 
if ( last . eq. 0 ) 
1 = last 

go to 

60 

— 

40 

1 = ip ( 1 ) 
lh = 1 + 1 





iua = iu(l) 
iub = iu(lh) - 

1 




if ( lh.eq. i ) 

iub = 

jpi 
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1 


iu (i) = 

CDIR$ IVDEP 

do 50 j = iua, iub 
j j = ju(j) 

if ( iu(jj).eq.i ) go to 50 
ju(jp) = jj 
jp = jp + 1 
iu ( j j ) = i 

if( min.gt.jj ) min = jj 
50 continue 

if ( jp.eq.jpp ) go to 70 

if ( l.ne.last ) go to 40 

60 if ( min.eq.nh ) go to 90 

70 I* ip (min) 

if ( 1 . eq. 0 ) go to 80 
ip { i) = ip ( 1 ) 
ip ( 1 ) = i 
go to 90 

80 ip (min) = i 

ip ( i ) = i 

90 iu(i) = jpi 

iu (n) = jp 

iu(nh) = jp 
ncof 2=iu (n) 
return 
end 

subroutine fbe (n, iu, ju, di , un, b , x, iopfb) 
implicit real*8 (a-h,o-z) 
real*8 un { 1) , di (1) , b (1) , x (1) 
integer iu(l),ju(l) 
iopfb=0 
nm = n - 1 
do 10 i = 1, n 
10 x (i) = b (i) 

do 40 k = 1, nm 
iua = iu(k) 
iub = iu(k+l) - 1 
xx = x(k) 

if ( iub. It. iua ) go to 30 
CDIR$ IVDEP 

do 20 i = iua, iub 

j j = j u (i) 

20 x ( j j ) = x(j]) - xx*un (l) 

iopf b=iopfb+2 * (iub-iua+l) 

30 continue 

40 continue 

do 41 jj = 1 , n 

41 x ( j j ) = x( j j ) *di ( j j ) 
iopfb=iopfb+n 




k s nm 


50 

iua s iu(k) 

iub = iu(k+l) - 1 

— 


if ( iub.lt. iua ) go to 70 
xx = x ( k ) 


CDIR$ 

IVDEP 



do 60 i = iua, iub 
j j-ju(i) 


60 

xx = xx - un(i)*x(j]) 
x (k) =xx 

iopfb=iopfb+2 * (iub-iua+l) 


70 

k = k -1 


if ( k.gt.O ) go to 50 

return 

end 

subroutine transa (n,m, ia, ja, iat, jat) 
implicit real*8 (a-h,o-z) 
integer ia (1) , ja (1) , iat (1) , jat (1) 
mh = m + 1 
nh = n + 1 
do 1 0 i = 2 , mh 
10 iat(i) = 0 

iab = ia(nh) - 1 
do 21 j j si, n 
CDIR$ IVDEP 

do 20 i = ia ( j j ) , ia(j j+1) -1 
j = ja (i) + 2 
iat(j) = iat(j) + 1 
continue 
continue 


20 

21 


zz 




iat { 1 ) = 1 

iat ( 2 ) = 1 

if ( m.eq.l ) go 

to 40 

30 

do 30 i = 3, mh 
iat ( i ) = iat ( i) 

+■ iat(i-l) 

40 

do 60 i = 1, n 



iaa = ia(i) 
iab = ia(i+l) - 

1 


if ( iab. It. iaa 

) go to 60 

CDIR$ 

IVDEP 


do 50 jp = iaa, 

iab 


j = ja(jp) + 1 
k = iat ( j ) 
j at ( k ) = i 

50 iat { j ) = iat ( j ) +1 

60 continue 

call tran2 (n, m, iat , jat, ia, ja) 
return 
end 

subroutine tran2 (n, m, ia, ja, iat , jat ) 
implicit real*8 (a-h,o-z) 
integer ia (1) , ja (1) , iat (1) , jat (1) 
mh = m + 1 

nh = n + 1 

do 10 i = 2, mh 
10 iat(i) = 0 

iab = ia (nh) - 1 
do 21 jj = 1, n 
CDIR$ IVDEP 

do 20 i = ia ( j j ) , ia ( j j+1) -1 
j = ja (i) +2 
iat(j) = iat(j) + 1 

20 continue 

21 continue 
iat (1) =1 
iat ( 2 ) = 1 

if ( m.eq.l } go to 40 
do 30 i = 3, mh 

30 iat(i) = iat(i) + iat(i-l) 

40 do 60 i = 1, n 

iaa = ia ( i) 
iab = ia(i+l) - 1 
if { iab. It. iaa ) go to 60 
CDIR$ IVDEP 

do 50 jp = iaa, iab 
j = ja ( jp) + 1 
k = iat ( j ) 
jat( k ) = i 

50 iat ( j ) = iat ( j ) +1 

60 continue 

return 
end 
c 

C*************************************************************************************** 

c 

c subroutine cputime (time) 

c real tar (2) 

c real*8 time 

c time=etime ( tar) 

c time=TSECND ( ) 

c time=0 . 01*mclock ( ) 

c return 

c end 

c 

£*+*+***************★*+++*★★+★+**+★*++***++★★**★**++**★*★★★*★★+*+★*★*★*****+*++★★****★★+ 

c 

subroutine numfaq(n, ia, ja, ad, an, iu, ju, di, un, ip, iup, tl, isupd 
1 , iopf ) 

implicit real*8 (a-h,o-z) 

real*8 an{l) ,ad(l) ,un(l) ,di(l) ,tl{l) 

integer ia (1) , ja (1) , iu (1) , ju (1) , ip (1) , iup(l) , isupd (1) 

LOOP = 8 
iopf =0 

call numf aO (n, ia, ja, an, iu, ju, di,un) 
do 10 j =1, n 
10 ip ( j ) = 0 

do 130 i = 1 , n 
ih = i + 1 
iua = iu(i) 
iub = iu(ih) - 1 
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CDIR$ IVDEP 

do 20 j = iua, iub 
20 di ( ju ( j ) ) = un ( j } 

di(i) = add) 

LN = ip ( i ) 

if { LN.eq.O ) go to 90 
50 L = In 

In = ip ( L) 

ik = min ( isupd (L) , i-L) 

iend = ( ik/ LOOP) * LOOP 

iucO = iup(L) 

iucOO = iucO 

iup(L) = iucO + 1 

iud = iu(L+l) - 1 

idif = iud - iucO 

iopf =iopf +2*ik* (idif+1) + ik 

do 1000 k = L, L+iend- 1 , LOOP 


CDIR$ 


j 2 = 

iu (K+2 ) 

- 

1 - idif 

j 3 = 

iu ( K+3 ) 

- 

1 - idif 

j 4 = 

iu (K+4 ) 

- 

1 - idif 

j5 = 

iu (K+5 ) 

- 

1 - idif 

j 6 = 

iu (K+6 ) 

- 

1 - idif 

j7 = 

iu (K+7) 

- 

1 - idif 

j 8 = 

iu (K+8 ) 

- 

1 - idif 

um = 

un ( iucO ) 

+ 

di (K) 

um2 = 

un ( j 2 ) 

+ 

di (K+l ) 

um3 = 

un ( j 3 ) 

* 

di (K+2) 

um4 = 

un ( j 4 ) 

* 

di (K+3) 

um5 = 

un ( j 5 ) 

•k 

di (K+4) 

um6 = 

un (j6) 

■k 

di (K+5) 

um7 = 

un ( j 7 ) 

+ 

di (K+6) 

um8 = 

un ( j 8 ) 

* 

di (K+7) 


IVDEP 

do 1010 j = 0,idif 
di (ju ( iucO+j ) j =di (ju ( iucO + j )} - 


1010 


1000 


1080 


um*un { iucO+j 
um3*un ( j 3+ j 
um5*un ( j 5+ j 
um7*un ( j 7 + j 


continue 
iucO = iu{K+9) - 1 - idif 

continue 

ileft = ik - iend 
Lend = L+iend 

if ( ileft. eq.O ) go to 1030 

go to (1011, 1012, 1013 , 1014, 1015, 1070, 1080) ileft 

go to 1030 

continue 

j 2 = iu(Lend+2) - 1 - idif 

iu(Lend+3) - 1 - idif 

iu(Lend+4) - 1 - idif 

iu(Lend+5) - 1 - idif 

iu(Lend+6) - 1 - idif 

iu(Lend+7) - 1 - idif 


) -um2*un ( j 2+ j ) 
) -um4*un { j 4 + j ) 
) -um6*un { j 6+j ) 
} -um8*un ( j 8+ j ) 


]3 

j4 

j5 

j6 

j7 

um 


= un(iucO)* di (Lend) 


CDIR$ 


um2 = un(j2) * di (Lend+1) 
um3 = un(j3) * di (Lend+2) 
um4 = un(j4) * di(Lend+3) 
um5 a un(j5) * di(Lend+4) 
um6 = un(j6) * di(Lend+5) 
um7 = un(j7) * di(Lend+6) 
IVDEP 

do 1076 j = 0,idif 

di (ju ( iucO+j ) ) =di (ju ( iucO + j 1 


-um*un ( iucO+j 

- um3*un(j3+j 

- um5*un{j5+j 

- um7*un(j7+j 


) -um2*un ( j 2 + j ) 
) -um4*un ( j 4+j ) 
) -um6*un ( j 6+ j ) 

) 


1076 

1070 


continue 
go to 1030 
continue 
j 2 = iu (Lend+2) 
j3 = iu(Lend+3) 
j 4 = iu(Lend+4) 
j 5 = iu(Lend+5) 
j 6 = iu(Lend+6) 
um = un ( iucO ) * 
um2 = un ( j 2 ) * 

um3 = un ( j 3 ) * 

um4 = un ( j 4 ) * 

um5 = un ( j 5 ) * 

um6 » un ( j 6 ) * 


- 1 - idif 

- 1 - idif 

- 1 - idif 

- 1 - idif 

- 1 - idif 
di (Lend) 
di (Lend+1) 
di (Lend+2) 
di (Lend+3 ) 
di (Lend+4 ) 
di (Lend+5) 
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CDIR$ IVDEP 

do 1066 j = 
di ( ju ( iucO+j 


1 

2 


0 , idif 
) ) =di ( ju ( 


iucO+j ) ) 


1066 

1015 


continue 
go to 1030 
continue 
j 2 = iu(Lend+2) 
j3 = iu (Lend+3) 
j 4 = iu (Lend+4) 
j 5 = iu(Lend+5) 
um = un ( iucO ) * 
um2 = un ( j 2 ) * 

um3 = un ( j 3 ) * 

um4 = un ( j 4 ) * 

um5 = un ( j 5 ) * 


- 1 - idif 

- 1 - idif 

- 1 - idif 

- 1 - idif 
di (Lend) 
di (Lend+1) 
di (Lend+2 ) 
di (Lend+3 ) 
di (Lend+4 ) 


-um+un (iucO+j ) -um2+un ( j2+j ) 

- um3*un { j 3+j ) -um4*un ( j4+j ) 

- um5*un { j 5+ j ) -um6*un ( j6 + j ) 


CDIR$ IVDEP 

do 1016 j = 0, idif 

di ( ju { iucO+j ) ) =di ( ju (iucO+j ) ) -um*un ( iucO+j ) -um2*un ( j2+j ) 

1 - um3*un ( j 3+j ) -um4*un ( j4+j ) 

2 - um5*un(j5+j) 


1016 

1014 


continue 
go to 1030 
continue 
j 2 = iu(Lend+2) 
j 3 = iu (Lend+3) 
j 4 = iu (Lend+4) 
um = un ( iucO ) * 
um2 = un ( j 2 ) * 

um3 = un ( j 3 ) * 

um4 = un ( j 4 ) * 


- 1 - idif 

- 1 - idif 

- 1 - idif 
di (Lend) 
di (Lend+1) 
di (Lend+2 ) 
di (Lend+3 ) 


CDIR$ IVDEP 

do 1017 j = 0, idif 

di ( ju ( iucO + j ) ) =di (ju (iucO + j ) ) -um*un (iucO+j ) -um2+un ( j2+j ) 
1 - um3*un ( j 3 + j ) -um4*un ( j 4 + j ) 


1017 continue 

go to 1030 
1013 continue 

j2 = iu (Lend+2) - 1 - idif 
j 3 = iu { Lend+3) - 1 - idif 
um = un(iucO)* di (Lend) 
um2 = un(j2) * di (Lend+1) 
um3 = un(j3) * di (Lend+2) 

CDIR$ IVDEP 

do 1018 j = 0, idif 

di ( ju (iucO + j ) ) =di (ju (iucO + j ) ) -um*un (iucO + j ) -um2*un ( j2+j ) 
1 - um3*un(j3+j) 


1018 continue 
go to 1030 

1012 continue 

j2 = iu (Lend+2) - 1 - idif 
um = un(iucO)* di(Lend) 
um2 = un(j2) * di (Lend+1) 

CDIR$ IVDEP 

do 1019 j = 0 , idif 

di (ju (iucO + j ) ) =di (ju (iucO + j ) ) -um*un { iucO + j ) -um2*un ( j2+j ) 

1019 continue 
go to 1030 

1011 continue 

um = un(iucO)* di(Lend) 


CDIR$ IVDEP 

do 1020 j = 0,idif 

di (ju (iucO + j ) ) =di (ju (iucO+j ) ) -um+un ( iucO + j ) 
1020 continue 
1030 continue 

if ( iuc00.eq.iud ) go to 80 
j = ju( iucOO + 1 ) 
ip ( L) = ip ( j ) 
ip ( j ) = L 

80 if ( LN.ne.O ) go to 50 
90 um = 1 . 00/di (i) 
iopf =iopf +1 

if ( iub.lt.iua ) go to 120 
CDIR$ IVDEP 

do 100 j = iua,iub 
100 un ( j ) = di ( j u ( j ) ) *um 

if (isupd(i) .eq.0) go to 130 
j = ju(iua) 
ip (i) = ip ( j ) 
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ip ( j ) = i 

120 iup(i) = iua 
130 continue 

do 1900 j = 1 , n 
1900 di ( j ) = 1.0/di(j) 

return 
end 

subroutine supnode (n, iu, ju, isupd) 
integer iu { 1) , ju (1) , isupd (1) 
do 1 0 i = 1 , n 
10 isupd{i) = 1 

c if(n.gt.O) return 

do 100 I = 1, n-1 

ilen = iu(i+l) - iu(i) 
isum = 0 

if {isupd (i) . eq. 0) go to 100 
iO = i + 1 
kO = iu { i ) 

101 continue 

kl = iu{i0) 

if { ilen . ne . iu ( i0 + l) - iu ( iO ) + i0 - i) go to 100 
do 200 jj = kO+iQ-i , k0+ilen-l 
if ( ju ( j j ) . ne . ju (kl) ) go to 100 
kl = kl + 1 
200 continue 

isum = isum + 1 
isupd ( i) = isupd { i) + 1 
isupd ( i+isum) = 0 
iO = iO + 1 
go to 101 
100 continue 

return 
end 

£*****★**+****+******+★******★***+***++★★++*++****★ 
subroutine mult spa (n, ia, ja, an, ad, b, c} 
implicit real*8 (a-h,o-z) 
real*8 an (1) , ad (1) , b (1) , c (1) 
integer ia(l),ja(l) 
do 10 i = 1 , n 
10 c (i) = ad (i) *b (i) 

do 3 0 i = 1 , n 
iaa = ia ( i) 
iab = ia (i+1) - 1 

if { iab.lt. iaa ) go to 30 
u = c (i) 
z = b ( i) 

CDIR$ IVDEP 

do 20 k = iaa, iab 
j = ja(k) 

u = u +an(k) * b ( j ) 

20 c { j ) = c ( j ) + an (k) * z 

c (i) = u 
30 continue 

return 
end 

subroutine numfaO (n, ia , j a , an, iu , ju, di , un) 
implicit real*8 (a-h,o-z) 
real*8 an(l) ,un(l) ,di(l) 
integer ia (1) , ja (1) , iu(l) ,ju(l) 
do 1000 i - l,n 



ih = i + 1 

iua = iu(i) 

iub = iu(ih) - 1 



if ( iub.lt. iua ) go to 

1000 

CDIR$ 

IVDEP 

do 20 j = iua, iub 


20 

di ( ju ( j ) ) = 0 . OdO 

iaa = ia(i) 
iab = ia(ih) - 1 



if ( iab. It. iaa ) go to 

1000 

CDIR$ 

IVDEP 

do 30 j = iaa, iab 


30 

di { j a ( j ) ) = an ( j ) 


CDIR$ 

IVDEP 

do 100 j = iua, iub 


100 

un { j ) = di ( ju ( j ) ) 


1000 

continue 


130 

continue 

return 

end 
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c%V%%%% %%%%%%%%%%%*%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%% 
subroutine numf al (n, ia, j a, ad, an, iu, ju, di , un, ip, iup, isupd, iopf ) 


implicit real*8 (a-h, o-z) 

dimens ion ia ( * ) ,ja{*) , ad ( * ) ,an(*) , iu ( * ) ,ju(*) ,di(*) ,un{*) 
dimension ip (*), iup ( *j , isupd (*) 

C purpose: numerical factorization 

c pp. 265-267 of text book, CE 795/895 

c input: ia,ja,an,ad given matrix A in RR(U)U. 

c iu,ju structure of resulting matrix U in 

c RR(U)0. 

c n order of matrices A and U. 

c output: un numerical values of the nonzeros of 

c matrix U in RR(U)0. 

c di inverse of the diagonal matrix D. 

c working space: ip of dimension N. Chained lists of rows 

c associated with each column, 

c iup of dimension N. Auxiliary pointers to 

c portions of rows. 

c di is used as the expanded accumulator . 

DO 10 J=1,N 
10 IP(J)=0 

c Begin of of 1-st (nested) loop: outer-most loop, for each i-th row 

DO 130 1=1, N 

IH=I+1 

IUA=IU(I) 

IUB=IU ( IH) -1 


IF ( IUB . LT . IUA) GO TO 40 
DO 20 J=IUA, IUB 
20 DI (JU(J) ) =0 . 

IAA=IA ( I ) 

IAB=IA ( IH) -1 
IF ( IAB . LT . IAA) GO TO 40 
DO 30 J=IAA, IAB 
30 DI (JA(J) ) =AN(J) 

40 DI (I) =AD (I) 

LAST=IP (I) 

IF (LAST . EQ . 0 ) GO TO 90 
LN=IP (LAST) 


c begin of 2-nd (nested) loop: considering all APPROPRIATED previous 

c rows (any appropriated rows 1 >i-l) 


50 L=LN 

LN=IP(L) 

IUC=IUP (L) 

IUD=IU (L+l) -1 
UM=UN (IUC) *DI (L) 

c begin of 3-rd (nested) inner-most loop: considering all APPROPRIATED 

c columns (any columns i >n) 

DO 60 J=IUC , IUD 
JJ=JU(J) 

60 DI (JJ) =DI ( JJ) -UN(J) *UM 
UN (IUC) =UM 
IUP (L) =IUC+1 
IF ( IUC. EQ. IUD) GO TO 80 
J=JU(IUC+1) 

JJ=IP (J) 

IF (JJ.EQ. 0) GO TO 70 
IP (L) =IP (JJ) 

IP (JJ) =L 
GO TO 80 
70 IP ( J) =L 
IP (L) =L 

c the following go to statement is equivalent to 2-nd nested loop 

c for factorization 


80 IF (L . NE . LAST) GO TO 50 


90 DI (I) =l.d0/DI (I) 

IF ( IUB. LT. IUA) GO TO 120 
DO 100 J=IUA, IUB 
100 UN ( J) =DI (JU(J) ) 

J= JU ( IUA) 

JJ=IP(J) 

IF ( JJ. EQ. 0) GO TO 110 
IP(I) =IP (JJ) 

IP(JJ) =1 
GO TO 120 
110 IP(J)=I 
IP(I)=I 


33 



120 IUP ( I ) =IUA 
130 CONTINUE 
return 
end 

c% %%%%%%%%% %%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


subroutine numf a2 (n, ia , j a , ad, an, iu, ju, di , un, ip, iup, isupd, iopf ) 
c purpose: numerical factorization 

c this portion of numerical factorization has unrolling level 2 

c Modifications March 30,1995 

implicit real*8 (a-h, o- z) 

dimension ia (* ) , ja (*) , ad (*) , an (*) , iu {*) , ju (*) , di (*) r un (*) 
dimension ip (*) , iup (*) , isupd ( *) 
c DEFINITIONS 

c input: ia,ja,an,ad given matrix A in RR(U)U. 

c iu,ju structure of resulting matrix U in 

c RR(U)0. 

c n order of matrices A and U. 

c output : un numerical values of the nonzeros of 

c matrix U in RR(U)0. 

c di inverse of the diagonal matrix D. 

c working space: ip of dimension N. Chained lists of rows 

c associated with each column, 

c iup of dimension N. Auxiliary pointers to 

c portions of rows. 

c di is used as the expanded accumulator. 

c initialisation du vecteur IP, IP indique si il y a des lignes 

c qui modifie la ligne qu'on factorise 

DO 10 J=1 , N 
10 IP(J)=0 

c Begin of of 1-st (nested) loop: outer-most loop, for each i-th row 

DO 130 1 = 1, N 

c ih,iuab,iu b : ligne suivante, debut ligne i et fin ligne i 

IH=I+1 
IUA=IU ( I ) 

IUB=IU ( IH) -1 


IF (IUB.LT. IUA) GO TO 40 

c initialisation de la portion du working array DI necesssaire 

DO 20 J=IUA, IUB 
20 DI { JU( J) ) =0 . 

IAA=IA ( I ) 

IAB=IA ( IH) -1 

c comment er la ligne suivante 

IF ( IAB . LT . IAA) GO TO 40 

c copier AN dans working array DI et aussi AD dans DI : A >U 

DO 30 J=IAA, IAB 
30 DI (JA(J) ) =AN ( J) 

40 DI (I) =AD (I) 


c voir si il y a une ligne au dessus qui va modifier la ligne I 

c Au depart IP est initialise a ) 

LAST-IP (I) 

IF ( LAST . EQ . 0 ) GO TO 90 
LN=IP (LAST) 

c let LN be the iHEAD of the supemode 

celiminer LN=IP(I) 


c begin of 2-nd (nested) loop: consider all APPROPRIATED previous 

c rows (any appropriated rows 1 >i - 1) which contribute to modify I 


c Debut de la partie vecteur 

50 L=LN 

LN=IP (L) 
loop=2 

m= min (i-1, isupd (1) ) 
iend= (m/loop) ♦loop 

isbegin=l 
isend=l+iend- 1 

c keep a copy of IUC1 and IUD1 of L pour la construction de IP 

IUCL=IUP (isbegin) 

IUDL-IU (isbegm+1) -l 
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do is= isbegin, isend, 2 
IUC1=IUP (is) 

IUD1=IU (is+1) -1 

c verifier si la 2 eme ligne commence au debut de la barriere supenode 

IUC2 = IUP (is + 1) 

UM1=UN(IUC1) *DI (is) 

UM2 =UN ( IUC2 ) *DI (is + 1) 

c begin of 3-rd (nested) inner-most loop: considering all APPROPRIATED 

c columns (any columns i >n) 

DO 60 J=IUC1 , IUD1 
JJ=JU(J) 


60 DI (JJ) =DI (JJ) -UN ( J) *UMl-un(iuc2-iucl+j ) *um2 
UN ( IUCI) =UM1 
un (IUC2) =um2 

c j 'update les IUP mais pour les IP je ne le fait que pour tout le supernode 

IUP (is) =IUC1+1 
iup (is+1) =iuc2+l 
enddo 

do is=isend+l , m+1- 1 
IUC1=IUP (is) 

IUD1 = IU ( is + 1 ) -1 
UM1=UN ( IUCI ) *DI (is) 


c begin of 3-rd (nested) inner-most loop: considering all APPROPRIATED 

c columns (any columns i >n) 


DO 62 J=IUC1 , IUD1 
JJ=JU(J) 

62 DI (JJ) =DI (JJ) -UN(J) *UM1 
UN (IUCI) =UM1 

c j' update les IUP mais pour les IP je ne le fait que pour tout le supemade 

IUP (is) =IUC1+1 
enddo 


c j ' ai fini les supernode , maintenant j' update le IP 

IF ( IUCL . EQ . IUDL) GO TO 80 

c if (isupd(i) .eq.O) go to 80 

J= JU ( IUCL+1 ) 

JJ=IP(J) 

IF ( JJ . EQ . 0 ) GO TO 70 
IP (L) =IP (JJ) 

IP(JJ) =L 
GO TO 80 
70 IP ( J) =L 
IP (L) =L 


c the following go to statement is equivalent to 2-nd nested loop 

c for factorization 

c 


80 IF (L . NE . LAST) GO TO 50 
90 DI (I) =1 . dO/DI ( I) 

IF (IUB.LT. IUA) GO TO 120 
DO 100 J=IUA, IUB 
100 UN ( J) =DI (JU(J) ) 

c addition 

if ( isupd (i ) . eq. 0 ) go to 120 
c if ( isupd (i) . eq . 0 ) go to 130 

J= JU ( IUA) 

JJ=IP(J) 

IF ( JJ. EQ . 0 ) GO TO 110 
IP (I) =IP (JJ) 

IP(JJ) =1 
GO TO 120 
110 IP(J)=I 
IP(I)=I 

120 IUP (I) =IUA 

130 CONTINUE 
return 
end 

subroutine numfa8 (n, ia, ja, ad, an, iu, ju, di,un, ip, iup, isupd, lopf ) 
implicit real+8 {a-h, o- z) 

dimension ia {*) , ja (*) , ad(*) , an ( * ) , iu(*) , ju(*) ,di (*) ,un(*) 
dimension ip {*), iup (*), isupd ( *) 


c 

c 


purpose : numerical factorization 
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c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c . 


This subroutine is called by spasolver.f 

this portion of numerical factorization has unrolling level 8 
Modif ications April 24,1995 

modification pour la rapidite October 5,1995 
DEFINITIONS 


output : 


n 

un 


input : ia (n+1) , ja (ncoef } , an (ncoef ) , ad (n) : given matrix A in RR(U)U. 

iu (n+1) , ju (ncoef 2) : structure of resulting matrix U in 

RR ( U ) 0 . 

order of matrices A and U. 
numerical values of the nonzeros of 
matrix U in RR(U)0. 
inverse of the diagonal matrix D. 
of dimension N. Chained lists of rows 
associated with each column, is differnt 
from the one in symbolic 
of dimension N. Auxiliary pointers to 
portions of rows. 

is used as the expanded accumulator. 


di 

working space : 


(ncoef 2 ) 

(n) 

ip 


lup 

di (n) 


DO 10 J=1,N 
10 IP(J)=0 

DO 130 1=1, N 
IH= 1+1 
IUA=IU ( I ) 

IUB=IU ( IH) -1 

cv IF(IUB.LT. IUA)GO TO 40 
CDIR$ IVDEP 

DO 20 J=IUA , IUB 
20 DI (JU(J) ) =0 .d0 
IAA=IA ( I ) 

IAB=IA ( IH) -1 

cv IF ( IAB . LT . IAA) GO TO 40 

CDIR$ IVDEP 

DO 30 J= IAA, IAB 
30 DI ( JA ( J) ) =AN ( J) 

40 DI (I) =AD (I) 

LAST=IP (I) 

IF (LAST. EQ. 0)GO TO 90 
LN=IP (LAST) 
cv loop =8 

50 L=LN 

LN=IP (L) 

m= min (i-1 , isupd(l) ) 
c iend= (m/loop) + loop 

iend= (m/8 ) *8 
c isbegin=l 

isend=l+iend-l 
c IUCL=IUP (isbegin) 

iucl=iup ( 1 ) 

iucl=iucl 

IUDL=IU ( isbegin+1 ) -1 
iudl=iu (1+1) -1 
iup (1) =iucl + 1 
length=IUDL- IUCL+1 
c do is= isbegin, isend, 8 

do is=l , isend, 8 
IUC2=IU ( is + 2 ) -length 
IUC3=IU ( is+3 ) -length 
IUC4 = IU ( is+4 ) -length 
IUC5 = IU ( is + 5 ) -length 
IUC6=IU (is+6 ) -length 
IUC7=IU ( is+7 ) -length 
iuc8=IU ( is + 8 ) -length 
UM1=UN ( IUCL) *DI (is) 

UM2=UN ( IUC2 ) *DI (is+1) 

UM3=UN ( IUC3 ) *DI (is+2) 

UM4=UN ( IUC4 ) *DI (is+3) 

UM5=UN { IUC5 ) *DI (is+4) 

UM6 =UN ( IUC6 ) *DI (is + 5) 

UM7 =UN ( IUC7 ) *DI (is+6) 

UM8 =UN ( IUC8 ) *DI (is + 7) 

CDIR$ IVDEP 

DO 68 J=IUCL, IUDL 
JJ=JU(J) 

68 DI (JJ) «DI (JJ) -UN ( J) *UMl-un (iuc2-iucl+j ) *um2 
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+ -un(iuc3-iucl+j) *um3 -un (iuc4 - iucl+ j ) *um4 

+ -un (iuc5-iucl + j ) *um5-un (iuc6- iucl+ j ) *um6 

+ -un ( iuc7-iucl+j ) *um7-un (iuc8- iucl+j ) *um8 

UN(IUCL) =UM1 
un ( iuc2 ) =um2 
un ( iuc3 ) =um3 
un (iuc4 ) =um4 
un ( iuc5 ) =um5 
un ( iuc6 ) =um6 
un ( iuc7) =um7 
un ( iuc8 ) =um8 
iucl=iu (is+9) -length 
iudl=iucl+length-l 
enddo 

c pour loop of level 7,6,5, 4, 3,2,1 

iloop=m- iend 

if (iloop.eq.O) go to 77 
go to(l, 2,3, 4, 5,6,7) Hoop 
go to 77 

c @@@@@@@@@@ @@@@@ 

1 is=isend+l 

UM1 =UN ( IUCL ) *DI (is) 

CDIR$ IVDEP 

DO 61 J=IUCL , IUDL 
JJ=JU(J) 

61 DI ( JJ) =DI { JJ) -UN ( J) *UM1 
UN (IUCL) =UM1 

go to 77 
c @@@@@@@@@ 

2 is=isend+l 
IUC2=IU ( is+2 ) -length 
UM1=UN ( IUCL) +DI (is) 

UM2=UN ( IUC2 ) *DI (is+1) 

CDIR$ IVDEP 

DO 62 J= IUCL, IUDL 
JJ=JU(J) 

62 DI ( JJ) =DI ( JJ) -UN ( J) *UMl-un (iuc2-iucL+j ) *um2 
UN ( IUCL) =UM1 

vin ( IUC2 ) =um2 
go to 77 
c 

3 is=isend+l 
IUC2=IU ( is+2 ) -length 
IUC3=IU ( is+3 ) -length 
UM1=UN ( IUCL) *DI ( is ) 

UM2 =UN ( IUC2 ) *DI (is + 1) 

UM3 =UN ( IUC3 ) *DI (is + 2) 

CDIR$ IVDEP 

DO 63 J= IUCL, IUDL 
JJ=JU(J) 

63 DI (JJ) =DI ( JJ) -UN ( J) *UMl-un (iuc2-iucl+j ) *um2 

+ -un ( iuc3 - iucl+ j ) *um3 

UN (IUCL) =UM1 
un (iuc2 ) =um2 
un ( iuc3 ) =um3 
go to 77 

4 is=isend+l 
IUC2=IU ( is+2 ) -length 
IUC3=IU ( is+3 ) -length 
IUC4=IU ( is+4 ) -length 
UM1=UN ( IUCL) *DI (is) 

UM2=UN ( IUC2 ) *DI (is+1) 

UM3=UN ( IUC3 ) *DI (is+2) 

UM4=UN ( IUC4 ) *DI (is+3) 

CDIR$ IVDEP 

DO 64 J= IUCL, IUDL 
JJ=JU(J) 

64 DI ( JJ) =DI ( JJ) -UN ( J) *UMl-un (iuc2-iucl+j ) *um2 

+ -un (iuc3 -iucl+j ) *um3-un (iuc4-iucl+j ) *um4 

UN (IUCL) =UM1 
un ( iuc2 ) =um2 
un(iuc3) =um3 
un(iuc4) =um4 
go to 77 


5 


is=isend+l 

IUC2=IU ( is+2 ) -length 
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IUC3= IU ( is+3 ) -length 
IUC4=IU ( is +4 ) -length 
IUC5=IU { is+5 } -length 
UM1=UN ( IUCL) *DI (is) 

UM2 =UN ( IUC2 ) *DI (is + 1) 

UM3=UN ( IUC3 ) *DI (is+2) 

UM4=UN ( IUC4 ) *DI (is+3) 

UM5=UN (IUC5) *DI (is+4) 

CDIR$ IVDEP 

DO 65 J= IUCL , IUDL 
JJ=JU(J) 

65 DI ( JJ) =DI ( JJ) -UN ( J) *UM1 - un (iuc2 - iucl + j ) *um2 

+ -un (iuc3 - iucl+ j ) *um3-un (iuc4-iucl + j ) *um4 

+ -un (iuc5-iucl+ j ) *um5 

UN (IUCL) =UM1 
un ( iuc2 ) =um2 
un ( iuc3 ) =um3 
un ( iuc4 ) =um4 
un ( iuc5 ) =um5 
go to 77 

6 is=isend+l 
IUC2=IU (is+2 ) -length 
IUC3=IU (is+3 ) -length 
IUC4=IU (is+4 ) -length 
IUC5=IU ( is+5 ) -length 
IUC6=IU (is+6 ) -length 
UM1 =UN ( IUCL ) *DI (is) 

UM2 =UN ( IUC2 ) *DI (is + 1) 

UM3=UN ( IUC3 ) *DI (is+2) 

UM4=UN ( IUC4 ) *DI ( is+3 ) 

UM5=UN ( IUC5 ) *DI (is+4) 

UM6=UN ( IUC6 ) *DI (is+5) 

CDIR$ IVDEP 

DO 66 J=IUCL , IUDL 
JJ=JU(J) 

66 DI (JJ) =DI ( JJ) -UN ( J) +UMl-un (iuc2-iucl+j ) *um2 

+ -un ( iuc3-iucl+j ) *um3-un ( iuc4-iucl+j ) *um4 

+ -un ( iuc5- iucl+ j ) +um5-un (iuc6- iucl+ j ) *um6 

UN (IUCL) =UM1 
un(iuc2) =um2 
\in (iuc3 ) =um3 
un ( iuc4 ) =um4 
un (iuc5) =um5 
un ( iuc6 ) =um6 
go to 77 
c ®®<s }@©@ @@@@ 

7 is=isend+l 
IUC2=IU ( is+2 ) -length 
IUC3=IU (is+3 ) -length 
IUC4=IU ( is+4 ) -length 
IUC5=IU (is+5 ) -length 
IUC6=IU ( is+6 ) -length 
IUC7=IU ( is+7 ) -length 
UM1=UN ( IUCL) *DI ( is ) 

UM2 =UN ( IUC2 ) *DI (is + 1) 

UM3=UN (IUC3 ) *DI (is+2) 

UM4=UN ( IUC4 ) +DI ( is + 3 ) 

UM5=UN ( IUC5 ) *DI ( is + 4 ) 

UM6=UN(IUC6) *DI (is+5) 

UM7=UN ( IUC7 ) *DI (is+6) 

CDIR$ IVDEP 

DO 67 J= IUCL, IUDL 

JJ=JU(J) . , 

67 DI (JJ) =DI (JJ) -UN ( J) *UMl-un(iuc2-iucl+i ) *um2 

+ -un (iuc3-iucl+j ) *um3 -un (iuc4 - iucl+ j ) *um4 

+ -un(iuc5-iucl+j) *um5-un (iuc6 - iucl+ j ) *um6 

+ -un (iuc7-iucl+j ) *um7 

UN (IUCL) =UM1 
un ( iuc2 ) =um2 
un (iuc3 ) =um3 
un ( iuc4 ) =um4 
un(iuc5) =um5 
>an (iuc6) =um6 
un ( iuc7) =um7 
go to 77 


77 


continue 

if (iucl . eq. iudl) go to 80 
j =ju (iucl + 1) 
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70 


JJ=IP(J) 

IF ( JJ . EQ . 0 ) GO TO 70 
IP ( L) =IP (JJ) 

IP ( JJ) = L 
GO TO 80 
IP (J) =L 
IP (L) =L 
80 I F ( L . NE . LAST ) GO TO 50 
90 DI (I) =1 . dO/DI ( I ) 

IF (IUB.LT. IUA) GO TO 120 
CDIR$ IVDEP 

DO 100 J=IUA, IUB 
100 UN ( J) =DI (JU{J) ) 

if ( isupd ( i) . eq . 0) go to 130 
J= JU ( IUA) 

JJ=IP ( J) 

IF ( JJ . EQ . 0 ) GO TO 110 
IP (I) =IP (JJ) 

IP ( JJ) =1 
GO TO 120 
110 IP(J)=I 
IP(I) =1 

120 IUP ( I ) =IUA 
130 CONTINUE 
return 
end 

0 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


subroutine REORD (n, ncof f , nreord, mtota, mtoti , a, iq, neig) 

real*8 a(l) 

integer IQ(1) 

if (2*ncoff .gt .mtota) then 

write (*,*)' REORD: increase MTOTA to: 2*ncoff = ' # 2*ncoff 

stop 

endif 

if (3*ncoff+7*n+5 .gt .mtoti) then 

write (*,*) 'REORD: increase MTOTI to: 3*ncof f +7*n+5 = 

1 3*ncof f +7*n+5 
stop 
endif 

if (nreord . ne . 0 ) then 
call cputime (timeO) 

call ireadO (n, ncof f , iq (1) , iq (1+n) , iq (2*n+l) , iq ( 3*n+2) , 

1 iq (4*n+3 ) , iq(5*n+4) , iq (5*n+4+ncof f ) ) 
call cputime (timel) 
rewind (18) 

write (18) ( iq ( i ) , i = 5*n+4+ncof f , 5*n+3+3*ncof f ) 
call genmmd (n, iq(4*n+3) r iq ( 5*n+4+ncof f ) , iq(l+n) , 

4 iq (6*n+5+3*ncof f ) , iq(2*n+l) , 

3 iq (3*n+2) , iq(l) , iq ( 5*n+4+3 *ncof f ) , nof sub, maxcon, nterms) 
rewind (18) 

read (18) (iq(i) , i=5*n+4+ncof f # 5 +n+3 +3+ncof f ) 
call cputime ( time2 ) 

call getnewk (n, iq(l) , iq(l+n) , iq(2*n+l) , iq(3*n+2) ,iq(4*n+3) , 
3iq ( 5*n+4+ncof f ) , iq (5*n+4) ) 

call copyk (n, iq ( 1 ) , iq{l+n) , iq(2*n+l) , iq(5*n+4) , iq(5*n+4 + 

4 ncof f ) ,a(l) ,a(l+ncoff) , ncoff) 

call copydb (n, iq(l+n) ,a(l) # a(l+n) ,a{2*n+l) ,a(3*n+l) ; neig) 
9988 continue 

write (23 , *) ' CPU to get MD reordering = time2 -timel 

endif 

return 

end 

£+*++*+**++*+++++**+*+********++****++*+******+*+**+**** 

subroutine ireadO (n, ncoff , ia, icol , xls, Is, xadj , ja, adjncy) 
Integer adjncy(l) ,ia(l) ,ls(l) , icol (1) ,xla (1) ,xadj (1) , ja(l) 
rewind ( 15 ) 

read{15) (ia(i) ,i=l,n) 
rewind ( 11) 

read (11) ( j a (i) , i = l f ncof f ) 

c write ( * , *) 'IREADO IA =' , 

c write (*,*) 'IREADO JA =' , 

do 10 i = 1 , n 
icol ( i) =0 


( ia (i) , i = l, 10) 

( j a ( i) , i=l , 10) 
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10 


ls(i) = 0 

xls (i) = 0 

xadj (i) =0 

xadj (1+n) = 0 

do 11 i = 1,2+ncoff 

11 adjncy(i) = 0 
do 12 i = l,n-l 

do 13 j = ia (i) , ia (i + 1) -1 

13 icol ( j a ( j ) ) = icol(ja(j)) + 1 

12 continue 
icol sum = 0 
xadj (1) =1 

Is (1) = icol (1) 

do 14 i = 2 , n 

xadj(i) = ia(i) + ls(i-l) 

14 Is ( i ) = ls(i-l) + icol (i ) 

xadj (1+n) = xadj (n) + icol (n) 

c write (*,*)' xadj (1+n) = xadj (1+n) 

do 15 i = 1 , n -1 

iadjO = xadj (i) + icol(i) - ia(i) 

CDIR$ IVDEP 

do 16 j = ia(i), ia(i+l) - 1 
jj = j a ( j ) 

adjncy (iadj 0+ j ) = 33 
adjncy (xadj (j j ) +xls (jj ) ) = i 

xls ( j j ) = xls ( j j ) + 1 

16 continue 

15 continue 
return 
end 

c 



c 

subroutine getnewk (n, ia, perm, iu, iup, xadj , adjncy, ju) 
integer ia (1) , perm (1) , iup (1) , xadj (1) , adjncy (1) , ju (1) , iu (1) 
do 1 i = 1 , n 
1 iup (perm (i) ) = i 

iu (1) = 1 

i count t = 0 
do 10 i = 1 , n 
iO = perm(i) 
icount = 0 

do 20 j = xadj (iO) , xadj (iO+ 1 ) - l 

jj = iup (adjncy (j ) ) 
if ( j j . le . i) go to 20 
ju (icountt+1) = jj 
icountt = icountt + 1 
icount = icount + 1 
20 continue 

iu(i+l) = iu(i) + icount 
10 continue 

call transa (n, n, iu, ju, iup, adjncy) 
return 
end 
c 

c 

c 

subroutine copvk (n, ia , perm, iu,ju,ja,un,an, ncof f ) 
integer ia (1) , iu ( 1) ,ja(l) , j u { X ) , perm(l) 
real*8 an(l),un(l) 
rewind ( 11) 

read(ll) ( j a ( i ) ,i = l,ncoff) 
rewind (12 ) 

read(12) (an(i) ,i=l,ncoff) 
rewind ( 15 } 

read( 15 ) (ia(i) ,i=l,n) 
do 200 1 = l,n-l 
10 = perm(i) 

do 220 J = iu ( I) , iu ( 1 + 1 ) - 1 
JO = perm ( ju ( J) ) 
ijO = iO 
i j 00 = jO 
if (jO. LT . iO ) then 
ijO = jO 
i j 00 = iO 
endif 


4 b 


CDIR$ IVDEP 


do 230 jj = ia (ijO) , ia (ijO+ 1) -1 
if ( ja ( j j ) .NE . ij 00 ) go to 230 



230 

220 

200 


un(J) = an(JJ) 
go to 220 
continue 
continue 
continue 
rewind ( 11 ) 

write (11) (ju(i) ,i=l,ncoff) 
rewind ( 12 ) 

write(12) (un(i) ,i=l,ncoff) 
rewind { 15 ) 

write (15) (iu(i) ,i=l,n+l) 


c 

write ( * , * ) 



c 

write (* , *) 

PERM =' 

“ (perm(i) , i=l # 10) " 

c 

write ( * , * ) 

IA 

, (IA(I) ,i=l,10) 

c 

write (* , * ) 

JA 

, (JA(i) , i=l, 10) 

c 

write ( * , * ) 

IU 

, (IU(I) ,i=l,10) 

c 

write (* , * ) 

' JU 

, (JU(i) , i=l,10) 

c 

write ( * , * ) 

' AN 

, (anti) , i=l, 10) 

c 

write ( * , * ) 

'UN 

, (un ( i) , i=l , 10) 

c 

write (* , *) 

/ 



return 

end 

c 

c 

c 


subroutine copydb (n, perm, diag, b, diagO , bO # neig) 
integer perm(l) 

real *8 diag (1) , b (1) , diagO (1) , bO (1) 
rewind {14 } 

read(14) (bO { i ) , i=l , n) 
rewind (13 ) 

read(l3) (diagO(i) ,i=l,n) 
do 100 I = 1 , n 
10 = perm(i) 
diag(i) = diagO (10) 
b ( i ) = b0{10) 

100 continue 

rewind (13 ) 

write (13) (diag(i) , i=l,n) 
rewind ( 14 ) 

write(14) (b(i) ,i=l,n) 
rewind (18 ) 

write ( 18 ) (perm (i) , i=l , n) 
if (neig . gt . 0) then 
rewind ( 10 ) 

read (10) (diagO (i) , i=l,n) 
do I = 1 , n 

10 = perm(i) 
diag(i) = diagO (10) 
enddo 
rewind ( 10) 

write(10) (diag(i) ,i=l,n) 

endif 

return 

end 


c 



c — 
c 

subroutine transa9 (n # m, ia , 

ja f iat, jat) 


implicit real*8 (a-h^o-z) 
integer ia (1) ,ja(l) , iat (1) 

, jat (l) 

10 

mh = m + 1 
nh = n + 1 
do 10 i = 2, mh 
iat(i) = 0 


CDIR$ 

iab = ia (nh) - 1 
do 21 jj = 1, n 
IVDEP 



do 20 i = ia { j j ) , ia(j j+1) - 

1 

20 

j = ja (i) +2 

iat ( j ) = iat ( j ) +1 

continue 


21 

continue 



iat(l) = 1 
iat (2 ) = 1 

if { m.eq.l ) go to 40 



4 $ 



30 

40 


do 3 0 i = 3 , mh 
iat(i) = iat(i) + iat(i-l) 
do 60 i = 1, n 
iaa = ia(i) 
iab = ia(i + l) - 1 
if { iab. It. iaa } go to 60 
CDIR$ IVDEP 

do 50 jp = iaa, iab 
j = ja ( jp) + 1 
k = iat(j) 
jat{ k ) = i 

50 iat(j) = iat(j) + 1 

60 continue 

call tran2 (n, m, iat , jat, ia, ja) 
return 
end 

subroutine tran9 (n, m, ia, ja, iat, jat) 
implicit real+8 (a-h,o-z) 
integer ia (1) , ja (1) , iat (1) , jat (1) 
mh = m + 1 

nh = n + 1 

do 1 0 i = 2 , mh 
10 iat (i) — 0 

iab = ia{nh) - 1 
do 21 jj =1, n 
CDIR$ IVDEP 

do 20 i = ia ( j j ) , ia { j j +1) -1 

j = ja (i) + 2 

iat { j ) = iat(j) + 1 

20 continue 

21 continue 


iat ( 1 } =1 
iat (2) = 1 

if ( m.eq.l ) go to 40 
do 30 i = 3, mh 

30 iat(i) = iat(i) + iat(i-l) 

40 do 60 i = 1, n 

iaa = ia(i) 
iab = ia(i+l) - 1 
if ( iab. It. iaa ) go to 60 
CDIR$ IVDEP 

do 50 jp = iaa, iab 
j = ja ( jp) + 1 
k = iat(j) 
jat( k } = i 

50 iat(j) = iat(j) + 1 

60 continue 


return 

end 

SUBROUTINE 

1 

1 

INTEGER 

1 

INTEGER 

INTEGER 

1 

1 

mquinup 
mqinLm 
time up 
timelm 
MAXINT 
DELTA = 


GENMMD ( NEQNS , XADJ, ADJNCY, PERM, INVP, 
DHEAD, QSIZE, LLIST, MARKER, 

NOFSUB , MAXCON, NT E RMS ) 

ADJNCY (1), DHEAD (1) , INVP(l) , LLIST (1) 

MARKER ( 1 ) , PERM ( 1 ) , QSIZE{1) 

XADJ (1) 

DELTA , EHEAD , I , MAXINT, MDEG , 

MDLMT , MDNODE, NEQNS , NEXTMD , NOFSUB, 
NUM, TAG 


= 0 
= 0 
= 0.0 
= 0.0 

10000000+1000000000 


0 


IF ( NEQNS .LE. 0 ) RETURN 
NOFSUB = 0 

CALL MMDINT { NEQNS, XADJ, ADJNCY, DHEAD, INVP, PERM 
1 QSIZE, LLIST, MARKER ) 

NUM = 1 

NEXTMD = DHEAD (1) 

100 CONTINUE 

IF ( NEXTMD .LE. 0 ) GO TO 200 
MDNODE = NEXTMD 
NEXTMD = INVP (MDNODE) 

MARKER (MDNODE) = MAXINT 
INVP (MDNODE) = - NUM 
NUM = NUM + 1 
GO TO 100 

200 CONTINUE 



GO TO 1000 


300 


400 


500 


600 


700 

800 


1 

1 


900 


1 

1 


1000 


IF ( NUM .GT. NEQNS ) 

TAG = 1 
DHEAD ( 1 ) =0 

MDEG = 2 
CONTINUE 

IF ( DHEAD (MDEG) . GT . 0 ) GO TO 400 
MDEG = MDEG + 1 
GO TO 300 
CONTINUE 

MDLMT = MDEG + DELTA 

EHEAD = 0 

CONTINUE 

MDNODE = DHEAD (MDEG) 

IF ( MDNODE .GT. 0 ) GO TO 600 
MDEG = MDEG + 1 

IF ( MDEG .GT. MDLMT ) GO TO 900 
GO TO 500 


CONTINUE 

NEXTMD = INVP (MDNODE) 

DHEAD (MDEG) = NEXTMD 

IF ( NEXTMD .GT. 0 ) PERM (NEXTMD) = - MDEG 
INVP (MDNODE) = - NUM 

NOFSUB - NOFSUB + MDEG + QS I ZE (MDNODE) - 2 
IF ( NUM+QS I ZE (MDNODE) . GT . NEQNS ) GO TO 1000 
TAG = TAG + 1 

IF ( TAG .LT. MAXINT ) GO TO 800 
TAG = 1 

DO 700 1=1, NEQNS 

IF ( MARKER (I) . LT . MAXINT ) MARKER ( I ) = 

CONTINUE 
CONTINUE 

CALL MMDELM ( MDNODE , XADJ, ADJNCY, DHEAD, INVP, 
PERM , QSIZE , LLIST , MARKER, MAXINT, 
TAG ) 

NUM = NUM + QSIZE (MDNODE) 

LLIST (MDNODE) = EHEAD 
EHEAD = MDNODE 

IF ( DELTA . GE. 0 ) GO TO 500 
CONTINUE 

IF ( NUM .GT. NEQNS ) GO TO 1000 

CALL MMDUPD ( EHEAD, NEQNS, XADJ, ADJNCY, DELTA, MDEG 
DHEAD, INVP, PERM, QSIZE, LLIST, MARKER 
MAXINT, TAG ) 

GO TO 300 
CONTINUE 

CALL MMDNUM ( NEQNS, PERM, INVP, QSIZE ) 

RETURN 


END 

SUBROUTINE 


1 

1 

1 

1 


INTEGER 

INTEGER 

INTEGER 


MMDUPD ( EHEAD, NEQNS, XADJ, ADJNCY, DELTA, 
MDEG, DHEAD, DFORW, DBAKW, QSIZE, 
LLIST, MARKER, MAXINT, TAG ) 


ADJNCY (1), DBAKW (1 
LLIST (1) , MARKER ( 1 ) , 

XADJ ( 1) 

DEG , DEG0 , DELTA , 

ENODE , FNODE , I 

ISTRT , J , JSTOP , 

MAXINT, MDEG , MDEG0 , 

NEQNS , NODE , Q2HEAD , 


DFORW (1) 
QSIZE (1) 


EHEAD , 
IQ2 

JSTRT , 
MTAG , 
QXHEAD , 


DHEAD ( 1 ) 


ELMNT 

ISTOP 

LINK 

NABOR 

TAG 


MDEG0 = MDEG + DELTA 
ELMNT = EHEAD 
100 CONTINUE 

IF ( ELMNT .LE. 0 ) RETURN 
MTAG = TAG + MDEG0 

IF ( MTAG .LT. MAXINT ) GO TO 300 
TAG = 1 

DO 200 1=1, NEQNS 

IF ( MARKER ( I ) . LT . MAXINT ) MARKER (I) 

200 CONTINUE 

MTAG = TAG + MDEG0 
300 CONTINUE 

Q2HEAD = 0 

QXHEAD = 0 

DEG0 = 0 
LINK = ELMNT 
400 CONTINUE 

ISTRT = XADJ (LINK) 

ISTOP = XADJ (LINK+1) - 1 
DO 700 I = ISTRT, ISTOP 
ENODE = ADJNCY (I) 


0 


0 




LINK = - ENODE 

IF ( ENODE ) 400, 800, 500 



500 

CONTINUE 

IF ( QSIZE (ENODE) . EQ . 0 ) GO 

TO 

700 


DEGO = DEGO + QSIZE (ENODE) 

MARKER (ENODE) = MTAG 

IF ( DBAKW (ENODE) .NE. 0 ) 

GO 

TO 700 


IF ( DFORW (ENODE) .EQ. 

2 ) 

GO TO 600 


LLIST( ENODE) = QXHEAD 



QXHEAD = ENODE 
GO TO 700 



600 

CONTINUE 

LLIST (ENODE) = Q2HEAD 
Q2HEAD = ENODE 



700 

CONTINUE 



800 

CONTINUE 
ENODE = Q2HEAD 
IQ2 = 1 



900 

CONTINUE 




IF ( ENODE .LE. 0 ) GO TO 1500 
IF ( DBAKW (ENODE) .NE. 0 ) GO TO 2200 
TAG = TAG + 1 
DEG = DEGO 
ISTRT = XADJ (ENODE) 

NABOR = ADJNCY( ISTRT) 

IF ( NABOR . EQ. ELMNT ) NABOR = ADJNCY ( ISTRT+1 ) 
LINK = NABOR 

IF ( DFORW (NABOR) . LT . 0 ) GO TO 1000 

DEG = DEG + QSIZE (NABOR) 

GO TO 2100 

1000 CONTINUE 

ISTRT = XADJ (LINK) 

ISTOP = XADJ ( LINK+ 1 ) - 1 

DO 1400 I = ISTRT, ISTOP 
NODE = ADJNCY (I) 

LINK = - NODE 

IF ( NODE .EQ. ENODE ) GO TO 1400 
IF ( NODE ) 1000, 2100, 1100 

1100 CONTINUE 

IF ( QSIZE (NODE) . EQ . 0 ) GO TO 1400 
IF ( MARKER (NODE) .GE. TAG ) GO TO 1200 
MARKER (NODE) = TAG 
DEG = DEG + QSIZE (NODE) 

GO TO 1400 

1200 CONTINUE 

IF ( DBAKW (NODE) .NE. 0 ) GO TO 1400 

IF ( DFORW (NODE) .NE. 2 ) GO TO 1300 

QSIZE (ENODE) = QSIZE (ENODE) + 

1 QSIZE (NODE) 

QSIZE (NODE) = 0 
MARKER (NODE) = MAXINT 
DFORW (NODE) = - ENODE 

DBAKW (NODE) = - MAXINT 

GO TO 1400 

1300 CONTINUE 

IF { DBAKW (NODE) .EQ.O ) 

1 DBAKW (NODE) = - MAXINT 

1400 CONTINUE 

GO TO 2100 

1500 CONTINUE 

ENODE = QXHEAD 
IQ2 = 0 

1600 CONTINUE 

IF ( ENODE . LE. 0 ) GO TO 2300 
IF ( DBAKW (ENODE) .NE. 0 ) GO TO 2200 
TAG = TAG + 1 
DEG = DEGO 
ISTRT = XADJ (ENODE) 

ISTOP = XADJ (ENODE+1) - 1 
DO 2000 I = ISTRT, ISTOP 
NABOR = ADJNCY (I) 

IF ( NABOR . EQ. 0 ) GO TO 2100 
IF ( MARKER (NABOR) .GE. TAG ) GO TO 2000 
MARKER (NABOR) = TAG 
LINK = NABOR 

IF ( DFORW (NABOR) .LT. 0 ) GO TO 1700 
DEG = DEG + QSIZE (NABOR) 

GO TO 2000 

1700 CONTINUE 

JSTRT = XADJ (LINK) 


43 



JSTOP = XADJ (LINK+1) - 1 
DO 1900 J = JSTRT, JSTOP 
NODE = ADJNCY(J) 

LINK = - NODE 

IF ( NODE ) 1700, 2000, 1800 

1800 CONTINUE 

IF ( MARKER (NODE) .GE. TAG ) 

1 GO TO 1900 

MARKER (NODE) = TAG 
DEG = DEG + QSIZE (NODE) 

1900 CONTINUE 

2000 CONTINUE 

2100 CONTINUE 

DEG = DEG - QSIZE (ENODE) + 1 
FNODE = DHEAD (DEG) 

DFORW( ENODE) = FNODE 
DBAKW (ENODE) = - DEG 

IF ( FNODE .GT. 0 ) DBAKW (FNODE) = ENODE 
DHEAD (DEG) = ENODE 
IF ( DEG .LT. MDEG ) MDEG = DEG 
2200 CONTINUE 

ENODE = LLIST (ENODE) 

IF ( IQ2 . EQ. 1 ) GO TO 900 
GO TO 1600 


2300 CONTINUE 

TAG = MTAG 

ELMNT = LLIST (ELMNT) 

GO TO 100 


1 

1 

1 


1 

1 

1 

1 


100 


END 

SUBROUTINE MMDELM ( MDNODE , XADJ, ADJNCY, DHEAD, DFORW , 

DBAKW, QSIZE, LLIST, MARKER, MAXINT 
TAG ) 

INTEGER ADJNCY ( 1) , DBAKW ( 1 ) , DFORW(l) , DHEAD ( 1 ) 

LLIST (1) , MARKER ( 1 ) , QSIZE (1) 

INTEGER XADJ ( 1 ) 


INTEGER 

ELMNT , I 

, ISTOP 

, ISTRT , 

J 


JSTOP , JSTRT 

, LINK 

, MAXINT, 

MDNODE 


NABOR , NODE 

, NPV 

, NQNBRS , 

NXNODE 


P VNODE , RLMT 
XQNBR 

, RLOC 

, RNODE , 

TAG 

MARKER (MDNODE) = TAG 




ISTRT * 

XADJ (MDNODE) 




ISTOP = 

XADJ (MDNODE+1) - 

1 




ELMNT = 0 
RLOC = ISTRT 
RLMT = ISTOP 

DO 200 I = ISTRT, ISTOP 
NABOR = ADJNCY (I) 

IF ( NABOR .EQ. 0 ) GO TO 300 

IF ( MARKER (NABOR) .GE. TAG ) GO TO 200 
MARKER (NABOR) = TAG 

IF ( DFORW (NABOR) .LT. 0 ) GO TO 100 
ADJNCY (RLOC) = NABOR 
RLOC = RLOC + 1 
GO TO 200 
CONTINUE 

LLIST (NABOR) = ELMNT 
ELMNT = NABOR 


200 

300 


400 


500 


1 


600 


CONTINUE 

CONTINUE 

IF ( ELMNT . LE. 0 ) GO TO 1000 
ADJNCY (RLMT) = - ELMNT 
LINK = ELMNT 
CONTINUE 

JSTRT = XADJ (LINK) 

JSTOP = XADJ (LINK+1) - 1 
DO 800 J = JSTRT, JSTOP 
NODE = ADJNCY (J) 

LINK = - NODE 

IF ( NODE ) 400, 900, 500 

CONTINUE 

IF ( MARKER (NODE) .GE. TAG .OR. 

DFORW (NODE) . LT. 0 ) GO TO 800 
MARKER (NODE) = TAG 
CONTINUE 

IF ( RLOC .LT. RLMT ) GO TO 700 
LINK = - ADJNCY (RLMT) 

RLOC = XADJ (LINK) 

RLMT = XADJ (LINK+1) - 1 
GO TO 600 




700 

CONTINUE 

— 


ADJNCY {RLOC) = NODE 
RLOC = RLOC + 1 


800 

CONTINUE 


900 

CONTINUE 

ELMNT = LLIST (ELMNT) 

— 


GO TO 300 


1000 

CONTINUE 

IF ( RLOC .LE. RLMT ) ADJNCY (RLOC) = 
LINK = MDNODE 


1100 

CONTINUE 


ISTRT = XADJ (LINK) 

ISTOP = XADJ (LINK+1) - 1 
DO 1700 I = ISTRT, ISTOP 
RNODE = ADJNCY(I) 

LINK = - RNODE 

IF ( RNODE ) 1100, 1800, 1200 

1200 CONTINUE 

P VNODE = DBAKW (RNODE) 

IF ( P VNODE .EQ. 0 .OR. 

1 P VNODE .EQ. (-MAXINT) ) GO TO 1300 

NXNODE = DFORW (RNODE) 

IF ( NXNODE .GT. 0 ) DBAKW (NXNODE) = PVNODE 
IF ( PVNODE .GT. 0 ) DFORW (PVNODE) = NXNODE 
NPV = - PVNODE 

IF ( PVNODE .LT. 0 ) DHEAD (NPV) = NXNODE 
1300 CONTINUE 

JSTRT = XADJ (RNODE) 

JSTOP = XADJ (RNODE+1) - 1 
XQNBR = JSTRT 
DO 1400 J = JSTRT, JSTOP 
NABOR = ADJNCY(J) 

IF ( NABOR . EQ. 0 ) GO TO 1500 

IF ( MARKER (NABOR) .GE. TAG ) GO TO 1400 
ADJNCY (XQNBR) = NABOR 
XQNBR = XQNBR + 1 

1400 CONTINUE 

1500 CONTINUE 

NQNBRS = XQNBR - JSTRT 

IF ( NQNBRS .GT. 0 ) GO TO 1600 

QSIZE (MDNODE) = QSIZE (MDNODE) + QSIZE (RNODE) 

QSIZE (RNODE) = 0 

MARKER (RNODE) = MAXINT 

DFORW (RNODE) = - MDNODE 

DBAKW (RNODE) = - MAXINT 

GO TO 1700 

1600 CONTINUE 

DFORW (RNODE) = NQNBRS + 1 
DBAKW (RNODE) = 0 
AD JNCY (XQNBR) = MDNODE 
XQNBR = XQNBR + 1 

IF { XQNBR . LE . JSTOP ) ADJNCY (XQNBR) = 0 
1700 CONTINUE 

1800 CONTINUE 
RETURN 

END 

SUBROUTINE MMDINT ( NEQNS , XADJ, ADJNCY, DHEAD, DFORW, 

1 DBAKW, QSIZE, LLIST, MARKER ) 

INTEGER ADJNCY ( 1) , DBAKW ( 1 ) , DFORW(l) , DHEAD (1) , 

1 LLIST (1) , MARKER (1) , QSIZE(l) 

INTEGER XADJ ( 1 ) 

INTEGER FNODE , NDEG , NEQNS , NODE 
DO 100 NODE = 1, NEQNS 
DHEAD (NODE) = 0 
QSIZE (NODE) = 1 
MARKER (NODE) = 0 
LLIST (NODE) = 0 
100 CONTINUE 

DO 200 NODE = 1, NEQNS 

NDEG = XADJ (NODE+1) - XADJ (NODE) + 1 
FNODE = DHEAD (NDEG) 

DFORW (NODE) = FNODE 
DHEAD (NDEG) = NODE 

IF ( FNODE .GT. 0 ) DBAKW ( FNODE ) = NODE 

DBAKW (NODE) = - NDEG 
2 00 CONTINUE 

RETURN 

END 

SUBROUTINE MMDNUM ( NEQNS, PERM, INVP, QSIZE ) 

INTEGER INVP ( 1) , PERM(l) , QSIZE (1) 
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INTEGER 


FATHER, NEQNS , NEXTF , NODE , NQSIZE, 

1 NUM , ROOT 

DO 100 NODE = 1, NEQNS 
NQSIZE = QSIZE (NODE) 

IF ( NQSIZE .LE. 0 ) PERM (NODE) = INVP(NODE) 

IF ( NQSIZE . GT. 0 ) PERM (NODE) = - INVP(NODE) 

100 CONTINUE 

DO 500 NODE = 1, NEQNS 

IF ( PERM (NODE) - GT . 0 ) GO TO 500 
FATHER = NODE 
200 CONTINUE 

IF { PERM (FATHER) . GT . 0 ) GO TO 300 
FATHER = - PERM (FATHER) 

GO TO 200 

300 CONTINUE 

ROOT = FATHER 
NUM = PERM (ROOT) +■ 1 
INVP (NODE) = - NUM 
PERM (ROOT) = NUM 
FATHER = NODE 
400 CONTINUE 

NEXTF = - PERM (FATHER) 

IF ( NEXTF . LE. 0 ) GO TO 500 
PERM (FATHER) = - ROOT 
FATHER = NEXTF 
GO TO 400 

500 CONTINUE 

DO 600 NODE = 1, NEQNS 
NUM = - INVP (NODE) 

INVP (NODE) = NUM 
PERM (NUM) = NODE 
600 CONTINUE 

RETURN 

END 

C* +*+*****★***★********+**+**★*+****+★****★+*+*****★********+*****★+*++++* 

c LUMP MASS CASE + shift factor for the regular Lanczos 

c 

c H. Runesha dec 22, 1997 

c 

£★★ + ***★****★**★***★*★★★ + ★★★★★*★**★★★★****★★+*★★***★**★**★★*** + ★*** + + ★+★*★ 


SUBROUTINE SP2LAN (N , LANMAX , NEIG , UN, DI , IU, JU, W, wl , w2 , w4 , 

$TEM, EIG, ERR, VEC , ncof f , ncof 2 , dmas s , Q, lump, ishif t ) 
implicit real*8 (a-h,o-z) 

REAL *8 W(l) , VEC (lanmax, lanmax) ,EIG(1) , W4 (1) ,W2 (1) ,Q(N, LANMAX) 
REAL* 8 TEM(l) ,ERR(1) , wl (1) , dmass (1) , UN ( 1) , DI (1) 

INTEGER IU(1) , JU(1) 

COMMON /QIN/ QLAN, EPSOO 
common/ sp 2 com /nbuf (4) 

rewind ( 10 ) 

read(10) (dmass { i) , i=l , n) 

iam=0 

nodes=l 

NBLK = 1 

RTOL = 0 . OOOOOOOOOldO 
NNNN= 4 *NEIG 

I F ( NNNN . GE . LANMAX ) NNNN=LANMAX - 1 
MN=0 

DO 61 1=1, LANMAX 
EIG (I) =0 . OdO 
61 ERR ( I) =0 . OdO 

EPS=GETEPS (IBETA, IT, IRND) 

EPSOO=EPS 
EPS=dsqrt (EPSOO) 

EPSb=EPS* 5 . OdO 
wnor=0 . OdO 
k=l 

do 14 i=l,n 
w ( i) =1 . OdO 

c if (k . gt . 100 ) k = 1 

14 wnor=wnor+w (i) *w (i) 
wnor=l . OdO/dsqrt (wnor) 

DO 15 1=1, N 

W ( I ) =w ( i) *wnor 
Wl (i) =0 . OdO 

15 CONTINUE 
1501 CONTINUE 

do 9901 i=l,n 
w2 (i) =dmass (i) *w(i) 
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9901 


continue 

bet=0 . OdO 
do 99011 1 = 1, N 

99011 bet=bet+w2 (i) *w(i) 

bet=dsqrt (bet) 
do 99012 1=1, N 
W1 (i) =w(i) /BET 
w ( i) =0 . OdO 
Q ( i , 1 ) = Wl(i) 

99012 continue 

do 99013 1 = 1, n 

99013 w2 (i) =dmass (i) *W1 (i) 

Q **■*■■*■**★★*★+★★★****★*+★★++*** 

DO 50 JJ=1 , LANMAX-1 
DO 1001 1=1 , N 
1001 W4 ( I ) =W2 ( I ) 

Q ******************************************** 

call fbe (n, iu, ju, di , un, w2 , w4 , iopfb) 
c write (*,*)' JJ = ',jj 

c call node (nodes, iam,n,nbw, imod, 8, stif , z,y, W4,maxa, irow, icolg, 

c ItemO , 2 , zO) 

Q ***★★★■*•* + + *•*’■*•★** + ★★•*•***★★ + +★★****** + **'*'**•*"*■** 

DO 55 1=1, N 
W4 (I) =W4 (I) -W ( I) *BET 
55 CONTINUE 

ALF=DDOT (N, W2 , 1 , W4 , 1) 

DO 60 1=1 , N 

W4 (I) =W4 (I) -ALF*W1 (I) 

60 CONTINUE 

£+******+**+**+++*+*++*+****++***+***+*★+**+*+***★★****+★★+**★***** 

do 9902 i=l , n 

9902 w2 (i) =dmass (i) *w4 ( i) 

Q *■*+*★ + ■*■*★★*★■*■ + *★*'*'*++★ 

BET2 =DDOT ( N , W4 , 1 , W2 , 1 ) 
bet2=dsqrt (bet2) 

c write (*,*)' before call REORTH: JJ = ',jj 

CALL REORTH (N,W2, 

&W4,Q,EPS, LANMAX , TEM , JJ , BET2 ) 
do 9903 i=l , n 

9903 w2 ( i) =dmass ( i ) *w4 ( i) 

Q + + + *■*■★★*★★**★★★******★ 

BET2=DDOT(N,W4, 1,W2, 1) 

c write (*,*)' before 100 : JJ,beta = ',jj,beta 

bet2=dsqrt (bet2) 

100 CONTINUE 

DBET2=1 . 0d0/BET2 
DO 110 1=1, N 
W(I) =W1 (I) 

W1 (I) =W4 (I) *DBET2 
Q ( I , JJ+1 ) =W1(I) 

W2 (I) =W2 (I) *DBET2 
110 CONTINUE 

EIG(JJ) =ALF 
ERR ( JJ) =BET2 
BET=BET2 

c write ( * , * ) ’ JJ , ALF , BET2 , iam= ' ,jj,alf,bet2 

IF(JJ.LT.NNNN) GO TO 50 
c if (nodes . gt . 1) call mp_sync (nbuf (4 ) ) 

c write (+,*) ' ishift =' , ishift 

CALL JACOBIPi (N, NEIG, JJ, BET2 , W (1) ,Q,VEC,EIG, 

$ERR , UN , TEM , RTOL , LANMAX , DMASS , 

$W1 , N+l , NNNN, NSUC , ncof f , ncof 2 , iu, ju, di , ishift) 

IF (NSUC.EQ.l) GO TO 2211 
50 CONTINUE 

120 CONTINUE 

CALL JACOBIPi (N, NEIG, JJ- 1 , BET2 , W ( 1 ) ,Q,VEC,EIG, 

$ ERR , UN , TEM , RTOL , LANMAX , DMAS S , 

$W1, N+l, NNNN, NSUC, ncof f, ncof 2 , iu, ju,DI, ishift) 

2211 CONTINUE 


RETURN 

END 


c 




SUBROUTINE REORTH (N, P, R, Q, EPS , LANMAX , HH , NJ , BET2) 
implicit real*8 (a-h,o-z) 

REAL*8 P(1),R(1),Q(N, LANMAX) 



REAL* 8 HH ( 1 ) 

DO 10 K= 1 , NJ- 1 
HH (K) =0 . OdO 
DO 20 1=1, N 

HH (K) =HH ( K } +P(I) +Q(I,K) 

20 CONTINUE 

10 CONTINUE 

HNORM=HH (K) 

DO 40 K=1 , NJ- 1 

c if (abs (hh (k) ) . It . l . 0e-l0) go to 40 

DO 30 1=1, N 
R { I ) =R ( I ) -HH (K) *Q(I,K) 

30 CONTINUE 

40 CONTINUE 

RETURN 
END 
c 

^★★★★★★★★★★★ + ****1t + + * + * + + + + * + *** + + ** + + * + + ** + * + + * + * + *** + *** + *****’A , * ,1 ****'*’ + * 

c 

SUBROUTINE JACOBIPi (N, NEIG, NJ, BET2 , V, Q, VEC, 

$ EIG, ERR, AN, HH, RTOL, LANMAX, DMASS , 

$ VP1, NP1, NNNN, NSUC, ncof f , ncof 2 , ia, ja, ad, ishift) 

implicit real*8 (a-h,o-z) 

REAL* 8 DMASS (1) , Q (N, LANMAX) , VEC (lanmax, lanmax) 

$ , EIG (1) , ERR ( 1 ) , HH ( 1 ) ,VP1(1) ,V(l),AN(l),ad(l) 
integer ia(l),ja(l) 
common/sp2com/nbuf (4) 

COMMON /QIN/ QLAN, EPS 

COMMON /NQIN/ NQL 

COMMON /QJA/ SUB (1000) 

iam=0 

nodes =1 

NSUC=0 

DO 1 1=1, LANMAX 
DO 2 J=l, LANMAX 
2 VEC (I, J) =0 . OdO 

HH ( I ) =EIG ( I ) 

SUB (I) =ERR(I) 

VEC {I, I) =1 . OdO 

1 CONTINUE 

c if (nodes . gt . 1 ) call mp_sync (nbuf (4 ) ) 

CALL TQL2 (LANMAX , NJ , EIG , ERR , VEC , IRR , EPS } 
c write ( * , * ) ' IN JACOBQ : TQL2 is done I ! ! ! 

DO 20 1=1, NJ 
err ( i ) =0 . OdO 
do 21 j=l,nj 

21 err (i) =err (i) +abs (vec ( j , i) ) 

ERR ( I ) =err ( i } *0 . 0 0 5d0*ABS (BET2+VEC (NJ, I) /EIG (I) ) 
c write <* , * ) ' I , err (i) , vec (nj , i) =' , i, err (i) , vec (nj , i) 

2 0 CONTINUE 

DO 30 1=1, NEIG 

C IF (ERR ( I ) .GT. 0.000001. AND.NJ.LT. (LANMAX-1) ) GO TO 300 

IF (ERR ( I ) .GT. RTOL .AND. NJ . LT . (LANMAX-1)) GO TO 300 
30 CONTINUE 

c if (nodes . gt . 1 ) call mp_sync (nbuf (4 ) ) 

CALL VECTRA(N,NEIG, 0, VEC, Q, LANMAX, NJ,V) 
c write (*,*)' IN JACOBQ: VECTRA is done Ml! 

C NSUC: =l,succesful 

NSUC=1 

c DO 580 L=1 , NEIG 

call cputime (tOl) 
ninc=8*nodes 
c do L=l,neig 

c do i=n-nadd+l,n 

c q (i , L) =0 . OdO 

c enddo 

c enddo 

c write (*,*) ' NADD = ',nadd,n 

c endif 

rewind ( 15 ) 

read(15) (ia(i) ,i=l,l+n) 
rewind (11) 

read (11) ( ja (i) , i=l,ncoff ) 
rewind ( 12 ) 

read (12 ) (an(i) ,i=l,ncoff) 
rewind (13) 

read (13) (ad(i) , i=l,n) 

c write ( * , * ) '**PI**AD=', (ad (i) , i = l , n) 



c 

write ( * , * ) 

' 


c 

c 

write ( * , * ) 
write ( * , * ) 

r lanmax, 

n, neig 7 

c 

write ( * , * ) 

lanmax, n 

, neig 

(ia (i) , i=l , 1+n) 

c 

write ( * , * ) 

' IA = ' , 

c 

write ( * , * ) 

' JA = ' , 

( ja (i) , i=l , ncof f ) 

c 

write ( * , *) 

' AN = f , 

(an (i) , i=l,ncof f ) 

c 

write ( * , * ) 

9 AD = ' , 

(ad ( i) , i=l , n) 

c 

write ( * , * ) 

' DM = # , 

(dmass (i) , i=l, n) 

c 

write ( * , * ) 

7 EIG= 9 , 

(eig(i) , i=l, iq) 

c 

c5 10 

do 510 j= 
write ( * , * 

1 , neig 
) (q (i , j ) 

, i=l , n) 


N0RMCHECK2 7 


DO 580 L=X,neig 

c call mult spa (n, ia , ja , an, ad, q, v) 

c call mulmeiko (n, , , , ) 

c if (nine. gt . 8) call mp_sync (nbuf (4 ) ) 

c call sp2mul ( iam # nine , n, icolg, maxa, stif , q (1 , L) , vpl , v) 

call multspa (n, ia, ja, an, ad, q (1, L) , v) 
c write (*,*)' Q ( i , L) = ' , L, {q ( i , L) , i=l , n) 

VNORM=0. OdO 
DO 590 1=1, N 

590 VNORM=VNORM+V ( I ) *V(I) 

WNORM=0 . OdO 

do 2239 i=l,n 

vpl (I) =dmass (i) *Q(i,L) 

2239 continue 

c write ( * , * ) 'I, vpl= 7 , i, (vpl ( ii ) , ii=l, n) 

RT=1 . OdO/EIG (L) 

DO 600 1=1, N 

c erm=max (abs (v (i) ) , abs (rt*vpl (i) ) ) 

c error =abs ( v (i) -rt*vpl (i) ) /erm 

c if (error . gt . 0 . 9 5d0 . and. erm. gt . 0 . OOOOldO) then 

c write (*,*) 7 L,I,V(I),,=\L,I,v(i) , rt*vpl(i) , error , q (i, L) 

c endif 

V(I) =V(I) -RT*VP1 (I) 

600 continue 

c write(*,*) 'V(I)*' , (v(i),i=l,n) 

do 601 i = l,n 

601 WNORM= WNORM+V ( I ) * V ( I ) 

c if (iam. eq. 0) write {*,*) 'VNORM, WNORM = ' , vnorm, wnorm, L 

VNORM=DSQRT (VNORM) 

WNORM=DSQRT (WNORM) 

HH(L) = WNORM /VNORM 
580 CONTINUE 

c write ( * , * ) ' HH = 7 , (hh (i) , i=l , neig) 

call cputime(t02) 
t02=t02 - tOl 

WRITE (23, *) 7 *** K, * EIG* , *HERTZ *,* ERROR *,* NORM *** iam / 

if ( ishif t . eq. 0 ) then 
DO 700 K=1 , NEIG 

c HERTZ=1 . 0/ (2 . 0*3 . 14 15 92 7*DSQRT (EIG (K) ) ) 

HERTZ=1 . 0/ (2 . 0*3 . 1415927*DSQRT (DABS (EIG (K) ) ) ) 

WRITE (23 , 701) K, 1 . 0/EIG (K) , HERTZ , ERR ( K) , HH (K) 

701 FORMAT (2X, 15, 2X, 4E15 .7) 

700 CONTINUE 

else 

DO 705 K=1 , NEIG 
EIG (K) =1 . /EIG (K) -float (ishif t) 
c HERTZ=DSQRT (EIG (K) ) / (2 . 0*3 . 1415927) 

HERTZ=DSQRT (DABS (EIG (K) ) ) / (2 . 0*3 . 1415927) 
c WRITE (23, 701) K,l. 0/EIG (K) , HERTZ , ERR (K) , HH (K) 

WRITE (23 , 701) K, EIG (K) , HERTZ, ERR (K) , HH (K) 

705 CONTINUE 

endif 

NSUC=1 

if (iam. eq. 0) write (23 , *) 'JACOBIQ: Steps m IAM = ' , nj , iam 
RETURN 

300 NNNN=MIN0 ( LANMAX -1 , NJ+3* (NEIG- I) +4) 

NSUC=0 

DO 304 1=1, LANMAX 
EIG ( I ) =HH ( I ) 



304 ERR (I) =SUB (I) 

RETURN 

END 

C 

£+***★★★★★★★★ + + + + + + ***★ + + ★ + **★* + ** + **** + * + ** + + ** + * + ** + ******* + ***.*** 

c 

SUBROUTINE TQL2 {NM , N, D , E , Z , IERR, MACHEP) 
implicit real*8 (a-h,o-z) 

REAL* 8 MACHEP 

REAL* 8 D (NM) , E (NM) , Z (NM, NM) 

C **+* MACHEP IS A MACHINE DEPENDENT PARAMETER SPECIFING 
C THE RELATIVE PRECISION OF FLOATING POINT ARITHEMETIC 

IQLO=ICOUNQ 
I ERR = 0 

C WRITE ( * , * ) ' *** TQL2 BEGIN ******** 

IF(N.EQ.l) GO TO 1001 
CDO 100 1=2, N 
C100 E ( I - 1 ) =E ( I ) 

E (N) =E <N-1) 

F=0 . OdO 
B=0 . 00 
E (N) =0 . OdO 
DO 240 L=1,N 
J=0 

H=MACHEP* (ABS (D (L) ) +ABS (E (L) } ) 

IF(B.LT.H) B=H 

c ******* l0 OK FOR SMALL SUB-SIAGONAL ELEMENT *********** 

DO 110 M=L, N 

IF (ABS (E (M) ) . LE . B) GOTO 120 
C**** E (N) ) IS ALWAYS ZERO, SO THERE IS NO EXIT THROUGH 
C THE BOTTOM OF THE LOOP ********************** 

110 CONTINUE 

120 IF(M.EQ.L) GO TO 220 

130 IF (J.EQ.30) GO TO 1000 

J=J+1 

C ******* FORM SHIFT ******** 

L1=L+1 
G=D (L) 

P= (D (LI) -G) / (2 . 0d0*E (L) ) 

R=DSQRT (P*P+1 . OdO) 

D (L) =E (L) / (P-t-SIGN (R, P) ) 

H=G-D(L) 

DO 140 I=L1,N 
140 D ( I ) =D ( I ) -H 

I COUNQ= I COUNQ + (N+l-Ll) 

F = F + H 

c ***** q L TRANSFORMATION ******* 

P=D (M) 

C=1 . OdO 
S=0. OdO 
MML=M-L 

c**** for I=M- 1 STEP -1 UNTIL L DO -- ******* 

DO 200 11=1, MML 
I=M- II 
G=C*E (I) 

H=C*P 

IF (ABS ( P) . LT . ABS (E (I) ) ) GO TO 150 
C=E ( I ) /P 

R=DSQRT (C*C+1 . OdO) 

E (1 + 1) =S*P*R 
S=C/R 
C=1 . 0/R 
GO TO 160 
150 C=P/E (I) 

R=DSQRT (C+C+l . OdO) 

E (1+1) =S*E (I) *R 
S=1 . OdO/R 
C=C*S 

160 P=C*D ( I ) - S*G 

D ( 1 + 1) =H+S* (C*G+S*D (I) ) 

C******* FORM VECTOR ********* 

DO 180 K=1,N 
H=Z (K, 1+1) 

Z (K, 1+1) =S*Z (K, I) +C*H 
Z (K, I) =C*Z (K, I) -S*H 
180 CONTINUE 

ICOUNQ=ICOUNQ+6*N 


SB 


c 

200 


CONTINUE 



c 


220 

240 

Q* * * 


c 


260 

C 


C 


280 

C 

300 

C 

C 

C 

1000 

1001 


G 


40 

30 

50 

20 


c 

C + + * 

c 


10 


20 


30 


I COUN Q= I COUNQ + 15* MML +13 
E (L) =S*P 
D(L) =C*P 

IF ( ABS (E { L) ) . GT . B) GO TO 130 

D (L) =D (L) +F 

CONTINUE 

ORDER EIGENVALUES AND EIGENVECTORS ********** 

DO 300 11=2, N 
1 = 11-1 
K=I 
P=D ( I ) 

DO 260 J= I I , N 
IF (D ( J) .LE. P) GO TO 260 
K=J 
P=D ( J) 

CONTINUE 

IF (K.EQ. I) GO TO 300 
D(K) =D(I) 

D { I ) =P 

DO 280 J=1,N 
P=Z (J, I) 

Z (J, I) =Z ( J, K) 

Z (J, K) =P 
CONTINUE 

CONTINUE 

GO TO 1001 

* SET ERROR --NO CONVERGENCE TO AN EIGENVALUE 
AFTER 30 ITERATIONS **************** 

IERR=L 

CONTINUE 

WRITE (23 ,*)' TQL2 END , IERR= ' , I ERR, ' ; at ',N, '-thLan step.' 

RETURN 

END 

SUBROUTINE VECTRA {N, NEIG, IBLOCK, VEC, Q, LANMAX , NJ, R) 
implicit real *8 (a-h,o-z) 

REAL * 8 Q (N, I B LOCK + LANMAX) ,VEC(lanmax, lanmax) ,R{1) 

write (23 ,*) 'VECTRA: lanmax , nj , neig = lanmax, nj , neig, ib lock 

DO 20 i = l,n 

do 30 j=l,neig 

R ( j ) =0 . OdO 

do 40 k=l,nj 

R ( j ) =R ( j ) +q(i, k) +vec (k, j ) 
continue 
do 50 kk=l,neig 
q ( i , kk) =R (kk) 

CONTINUE 

WRITE (23 ,*)' **+* VECTRA END , ilen,iseg= ++★★*' 

RETURN 

END 

*******************************+************************+****** 


FUNCTION GETEPS (IBET, IT, IRND) 
implicit real *8 (a-h,o-z) 
a = l.OdO 
a = a + a 

if { ( (a+1 . OdO) -a) -1 . OdO . eq. 0 . OdO) go to 10 

b=l . OdO 

b=b+b 

if ( (a+b) -a . eq . 0 . OdO ) go to 20 

ibeta=int (real ( (a+b) -a) ) 

beta=float (ibeta) 

it =0 

b=l. OdO 

it=it+l 

b=b*beta 

if ( ( (b+1 . OdO ) -b) - 1 . OdO . eq . 0 . OdO ) go to 30 
irnd =0 

betaml=beta- 1 . OdO 

if ( (a+betaml ) -a . ne . 0 . OdO) imd=l 

betain=l . OdO/beta 

a= 1 . 0 d 0 5*7 

do 40 i=l , it+3 
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a=a*betain 
40 continue 

50 if ( ( 1 . OdO+a) - 1 . OdO . ne . 0 . OdO) go to 60 

a=a*beta 
go to 50 
60 eps=a 

if { (ibeta . eq. 2 ) . or . (irnd. eq. 0) ) go to 70 
a= (a* (1 . OdO+a) ) / (1 . 0d0+l . OdO ) 
if( (l.OdO+a) -l.OdO.ne.O.OdO) eps=a 
70 geteps=eps 

return 
end 
c 

C 


real*8 function ddot (n , w, k , w2 , j ) 
implicit real*8 (a-h,o-z) 
real*8 w (1) ,w2 (1) 
dd=0 . OdO 
do 1 i = l,n 

1 dd=dd+w ( i ) *w2 ( i ) 

ddot=dd 
return 
end 
c 

£***★** + + *+ +★*★ + + ***+★★*•*•* + + *** + **★★*★*★+★********★********★★******★ 


************* Block Lanczos Driver ******************************** 

************************************************************************* 

SUBROUTINE SP2LAN(N, LANMAX, NEIG, UN, DI , IU, JU, W, wl , w2 , w4 , 

$TEM, EIG, ERR , VEC, ncoff , ncof2 , dmass, Q) 
implicit real*8 (a-h,o-z) 

REAL *8 W(l) , VEC ( 1 anmax , 1 anmax ) f EIG(l) , W4 (1) # W2 (1) , Q (N, LANMAX) 

REAL* 8 TEM(l) ,ERR{1) ,wX{l) ,dmass ( 1 ) , UN { 1 ) , DI ( 1 ) 

INTEGER IU ( 1 ) , JU ( 1 ) 

******★★ + *****★ + + + ****★** + **★* + ★*★*★ + ★ + ★** + + *★★★★*★** + **★★★★*★***★★★★★***•■*• 


subroutine blanmain (n, isize , iblock, neig, un, di , iu, ju, r , p, tem, dmass , 
1 beta , beta i , eig , vec , alf a , q, t , b , am, ia , ja, lump, ncoff , ishif t ) 

implicit real*8 (a-h,o-z) 

real *8 T ( isize , isize) , un (1) , di (1) , am (1) , eig { isize) 
real *8 dmass { 1 ) , q (n, isize + iblock) , r (n, iblock) , tern (n, iblock) 
real *8 alfa (iblock, iblock) ,beta (iblock, iblock) , vec (isize, isize) 
real *8 b(isize, isize) , betai (iblock, iblock) , p (n, iblock) 
integer ia (1) , ja (1) , iu (1) , ju (1) 


This is the driver for block Lanczos eigensolver ***** 

1. Initialization: 

do i = 1, isize 
do j = 1 , isize 
T ( j , i ) = 0 . OdO 

B ( j , i) = 0 . OdO 
enddo 
enddo 

if (lump . eq . 1) then 
rewind { 10 ) 

read(10) (dmass (i) ,i=l,n) 
write (23 , *) ' Lumped Mass is used ! ' 

else 

write (23 ,*) ' Consistant Mass is used 1 ' 
rewind { 10 ) 

read (10) (dmass (i) , i=l,n) 
rewind ( 17 ) 

read { 17 ) (am (i) , i=l, ncoff ) 
rewind (15) 

read (15) (ia(i) , i=l, 1+n) 
rewind { 11) 

read(ll) ( ja (i) , i=a, ncoff ) 

c stop 

endif 

C**** Do the shifting: K~ = K + ishif t*M 
lanmax=isize/ iblock 

write (23 , *) ' n, isize, iblock, ishift* 1 ,n, isize, iblock, ishift 

sr<? 
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do j = 1, iblock 
do i = 1 , n 

q(i, j ) = 0 . OdO 
r ( i , j ) = 0 . OdO 

p ( i , j ) = 0 . OdO 

enddo 
enddo 
C 

xmul=float ( ( (n+1) **3) /3) 
xmul=n 

xmul=dsqrt (3 . OdO ) / (xmul*dsqrt (xmul) ) 
c xmul=dsqrt (2 . OdO ) /f loat (n) 

c xmul = 1.0d0 ! just for the 6x6 ex. 

write (23 , *)' xmul = ' , xmul 
do i = 1 , n 

c r (i, 1) = xmul 

r(i,l) = f loat (i) *xmul 
if ( iblock . GE . 2 ) then 
if((i/2)*2.eq.i) then 
xmull=n-i+l 
else 

xmull=i-n- l 

endif 

r ( i , 2 ) = xmul 1 * xmul 
if { iblock . eq. 3 ) then 

r (i , 3 ) =1 . 0d-03 + sin ( 3 . 14 15 926 * i/4 ) 
endif 
endif 

c write ( *,*)' r {i , 1) = f ,i,r(i,l) 

enddo 

r(n,2)=1.0d0 l just for the 6x6 ex. 

call stepf (n, iblock, tem, q, r {1, 1) , beta, ia, j a, am, dmass, lump, 

11) 

call stepei (n, iblock, p {1, 1) ,q(l, 1) , ia, ja, am, dmass, lump) 

** Now, the main DO LOOP starts here: J = 1,2, 3 lanmax 

DO J = 1, lanmax 


j 0 1 = (J-l) *iblock+l 
j 02=J*iblock+l 
j 03= (J-2) *iblock+l 
do k = 1, iblock 

do i = 1, iblock 

betai{i,k) = beta{i,k) 
enddo 

enddo 


call stepa (n, iblock, r (1 , 1) ,p(l,l) ,iu,ju, di, un) 

if ( J. ne . 1) call stepb (n, iblock, r (1 , 1) , q ( 1 , J03) , betai) 

call stepc (n, iblock, p (1, 1) ,r(l,l) ,alfa) 


IF (J.EQ. LANMAX) GO TO 1000 


call stepd (n, iblock, q(l,J01) ,r(l,l) , alfa) 
do k = 1, iblock 

write (23,*) ; J, alfa (col . ) = ' ,J, (alfa(i,k) ,i=l, iblock) 
enddo 


C 

C 

1000 

C 

C 


call stepei (n, iblock, p ( l , l) , r (1 , 1) , ia , j a, am, dmass, lump) 

call stepf (n, iblock, tem, q, r (1, 1) , beta , ia , j a , am, dmass 
1 , lump,J+l) 

call stepei (n, iblock, p (1, 1) ,q(l, J02) , ia, ja , am, dmass , lump) 
continue 

call assemt (isize, iblock, T, alfa, beta, J) 

ENDDO 

call JACOBI (T, B , VEC, TEM (1,1) , TEM (1,2) , ISIZE , TOL, NSMAX, IFPR, 
1 ISHIFT) 


ss 
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write (23 , * )' ***** Max. Element in the K-th VEC vectors ******* 
do L = 1 , isize 
qmax=0 . OdO 
iqmax=0 

do i = 1, isize 

if (dabs (VEC (i , L) ) .gt . qmax) then 
qmax=dabs (VEC ( i , L) ) 
iqmax=i 
endif 
enddo 

write (23 , *)' K, ivmax, Vmax = ' , L # iqmax , qmax 
enddo 

write (23 ,*)' before VECTRA* Max. Element in the L-th Q vectors *' 

do L = 1, isize 

qmax=0 . OdO 

iqmax=0 

do i = 1 , n 

if (dabs (q ( i , L) ) . gt .qmax) then 
qmax=dabs (q ( i , L) ) 
iqmax=i 
endif 
enddo 

write (23, *) 'L, iqmax, Qmax = L, iqmax, qmax 
enddo 

call VECTRA { N , NE IG , IBLOCK, VEC, Q , isize, isize, R(l,l) ) 
do k ss l,neig 

write (*,*)' k, Q= ' ,K ; (q(i,k) , i=l, n) 
enddo 

call check (n, neig, isize, ncoff , ia, ja , un, di, q, p (1, 1) ,tem(l,l) , 

1 tem (1 , 2 ) , r (1 , 1) , am, dmass , lump, ishif t ) 


return 
end 

subroutine stepa (n, iblock, r , p, iu, ju, di , un) 
implicit real*8 (a-h,o-z) 

real *8 un (1) , di (1) , r (n, iblock) , p (n, iblock) 
integer iu (1) , ju (1) 

***** This subroutine finds: r that Kr = p in Step a. ******* 
do i = 1, iblock 

call fbe (n,iu,ju,di,un,p(l,i) ,r(l,i) , iopfb) 
enddo 
return 
end 

************************************************************** 


subroutine stepb (n, iblock, r , q, beta) 
implicit real*8 (a-h,o-z) 

real *8 r (n, iblock) , q (n, iblock) , beta (iblock, iblock) 

** This subroutine finds r (head) =r (bar ) - q*beta ( transpo . ) in Step b. **** 

** beta stored as upper- trangular matrix 

do i = 1, iblock 
do j = i, iblock 
do k = 1 , n 

r (k, i ) =r (k, i) - q(k f j) * beta(i,j) 
enddo 
enddo 
enddo 
return 
end 

********************************************************************* 

subroutine stepc (n, iblock, p , r, al fa) 
implicit real*8 (a-h,o-z) 

real*8 r (n, iblock) , p (n, iblock) , alfa (iblock, iblock) 
c **** This subroutine finds: alfa = P(transp.)*R in Step c. ****** 
do j = 1, iblock 
do i = 1 , j 

alfa (i, j ) =0 . OdO 
do k = 1 , n 

alf a ( i , j ) =alf a ( i , j ) +p(k,i) *r(k,j) 
enddo 

alfa ( j , i) = alfa (i, j ) 
enddo 
enddo 
return 
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end 


subroutine stepd (n, iblock, q, r , alfa) 
implicit real*8 (a-h,o-z) 

real *8 r (n, iblock) , q (n, iblock) , alfa ( iblock, iblock) 

*** This subroutine finds: R = R(head) - Q*alfa in Step d. ****** 
do j = 1, iblock 

do i = 1, iblock 
do k = 1 , n 

r (k, j j =r (k, j ) - q(k,i) * alfa(i,j) 
enddo 
enddo 

enddo 

return 

end 

********************************************************************* 


subroutine stepe (n, iblock, p, r , ia, ja, am, dmass, lump) 
implicit real*8 (a-h,o-z) 

real *8 r (n, iblock) , p <n # iblock) , am (1) , dmass (1) 
integer ia(l),ja(l) 

** + This subroutine finds: P(bar) = DMASS*R in Step e. 
if ( lump . eq . 1 ) then 

***** lumped Mass matrix ************ 
do j = 1, iblock 
do i = 1 , n 

p (i, j ) =dmass (i) * r(i,j) 

enddo 
enddo 


else 

write ( * , * ) ' e : dmass = ' , (dmass (i) , i = l,n) 
write (*,*)' e : am = ' , (am ( i) , i=l , 6 ) 
t***** consistant Mass matrix ************* 
do j = 1, iblock 

call multspa (n, ia, ja, am, dmass , r (1, j ) , p (1, j ) ) 
enddo 


endif 

return 

end 




******+************************ 


subroutine stepf (n, iblock, tem, q, r , beta, ia, ja, am, dmass , lump, JB) 
implicit real*8 (a-h,o-z) 
real *8 q{n, 1) ,am(l) , dmass (1) , r (n, iblock) 
real*8 beta (iblock, iblock) , tem (n, iblock) 
integer ia(l) , ja(l) 

************* This is for Steps f, g and h. 

************* This is similar to that of page 5 of 5 . 
************* iblock = block size; n = size of K & M; 
************* Looking for: q*beta = r, where only r is known 
************* and qMq = delta 


*************** 

*************** 


******* 


write {*,*)' in stepf: LUMP = ' , lump 
jO = ( JB-1) *iblock 

do i = 1, iblock 
do j = 1 , n 

tem ( j , i ) = 0 . OdO 

q ( j , i + jO) = r ( j , i) 
enddo 


do j = 1, iblock 

beta ( j , i) = 0 . OdO 
enddo 
enddo 

if (lump . eq. 1) then 

***** lumped Mass matrix *********** 
do j = 1, iblock 
do i = 1 , n 

tem { i , j ) = dmass ( i) * r(i,j) 
enddo 
enddo 


else 




consistant Mass matrix ************* 
do j = 1 , iblock 

call multspa (n, ia, ja, am, dmass , r (1, j ) , tem(l , j ) ) 


£{ 
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enddo 

endif 

**** now: tern = Mr already ! 

write (*,*)' stepf : finished tem=Mr I' 
do k = 1, iblock 

* 

do j =1, k-1 

********* 

do i = 1 , n 

beta ( j , k) =beta ( j , k) + q (i, j +j 0) *tem (i , k) 
enddo 

beta{k,j) = 0 . OdO 

write (23 ,*)' beta (j , k) = ',beta(j,k) 
do i = 1 , n 

write { * , *) ' q ( i , k) , q ( i , j ) = ' , q (i, k+j 0) , q (i , j+jO) 
q(i,k + j0)= q (i , k+j 0) - beta { j , k) *q ( i ( j + j 0) 
enddo 

********* 

enddo 


if (lump . eq. 1) then 

***** lumped Mass matrix ************* 
do i = 1 , n 

tem(i,k) = dmass(i) * q(i,k+j0) 
enddo 


else 

***** consistant Mass matrix ************* 

write (* , *) ' f :dmass= f , (dmass (i) , i=l, n) 
write (*,*)' f : am = r , (am ( i) , i=l , 6 ) 
write (*,*)' f :q = ' , (q (i , k+j 0) , i=l, 4) 
write (*,*)' f : tem= # , ( tem ( i , k) , i = l # 4 ) 

call mult spa (n, ia, ja, am, dmass, q(l,k+j0) , tem(l,k) ) 
write (* , *) ' f : tem=Mq= ' , (tem(i,k),i = l,4) 

endif 

beta(k,k) = 0 . OdO 

do i = 1 , n 

beta{k,k) = beta(k,k) + q (i, k+jO) *tem (i # k) 
enddo 

beta(k,k) = dsqrt (beta (k, k) ) 
xmult=l . OdO /beta (k, k) 

write (23 ,*)' beta (k, k) = f , k, beta (k, k) 
do i = 1 , n 

q(i f k+j0) = xmult * q(i,k+j0) 
enddo 

enddo 

write (*,*)' stepf : finished beta l' 

*** The following is for REORTHOGONALIZATION/normalization ****** 

******* jb i s the block # of the current block. ***** 

do m = ( JB-1) *iblock+l, JB*iblock 

if(n.gt.l) goto 1995 

*★★* DO: temp(i,l) = M*q(l,m) 


if ( lump . eq . 1} then 
do j = 1 , n 

tem ( j , 1) = dmass ( j ) * q(j,m) 
enddo 
else 

call multspa (n, ia, ja, am, dmass, q (1, m) , tem(l, 1) ) 
endif 

write (*,*)' stepf : finished tem=Mq I m * ',m 

do i = l,m-l 

tem(i,2) = 0 . OdO 
do j = 1 , n 

tem (i, 2) =tem (i, 2) + q ( j , i) *tem ( j , 1) 
enddo 
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c write (*,*)' in REOR: i,m # tem(i,2) = ' , i , m, tem ( i r 2 ) 

enddo 

C++++++ 

do i = l,m-l 
do j = 1 , n 

q( j ,m) =q(j ,m) - tem(i,2) * q(j,i) 
enddo 
enddo 

C++++++ 

1995 continue 

if (lump . eq. 1) then 
do j = 1 , n 

tem(j,l) = dmass(j) * q(j,m) 
enddo 
else 

call mult spa (n, ia, ja, am, dmass , q {1 , m) , tem (1 , 1) ) 
endif 

qnorm=0 . OdO 
do i = 1 , n 

qnorm=qnorm+q (i , m) *tem (i, 1) 

c write (*,*) ' i , q (i, m) , tem (i , 1)=' , i, q ( i, m) , tem (i , 1) 

enddo 

qnorm=l . OdO/dsqrt (qnorm) 

c write (*,*)' JB, m, qnorm = qnorm 

do i = 1 , n 

q(i,m) = q(i,m) * qnorm 
enddo 

enddo 


return 
end 

********************************************************************** 

subroutine stepei (n, iblock, p, q, ia, ja, am, dmass, lump) 
implicit real*8 (a-h,o-z) 

real *8 q (n, iblock) , p (n, iblock) , am (1) , dmass (1) 
integer ia(l),ja(l) 

**** This subroutine finds: P = DMASS*Q in Step e & i . ****** 

if (lump . eq. 1) then 

****** lumped Mass matrix ************ 
do j = 1, iblock 
do i = 1 , n 

p (i, j ) =dmass (i) * q(i,j) 
enddo 
enddo 


else 

****** consistant Mass matrix ************* 
write ( * , * ) ' ei : dmass = ' , (dmass (i) , i=l , n) 
write <*,*)' ei : am = ' , (am (i) , i = l, 6) 
do j = 1, iblock 

call multspa (n, ia, ja, am, dmass ,q(l,j),p(l,j)) 
enddo 


endif 

return 

end 




subroutine assemt ( isize , iblock , t , alf a , beta , JJ) 
implicit real*8 (a-h,o-z) 

real *8 T { isize , isize) , alf a (iblock, iblock) 
real*8 beta (iblock, iblock) 

***** This subroutine puts: alfa & beta into T matrix ******* 

JO = ( JJ-1) *iblock 

J1 = JJ* iblock 
if (J1 . LT . isize) THEN 
do J = 1, iblock 
do I = 1, iblock 
T ( JO + I , J0+ J) = alfa (I , J) 

T ( J0+ J, Jl+I ) = beta (I, J) 

write {*,*) 'jO+i,jO+j,T = ' , jO + i,jO + j ,T(jO+i,jO+j) 
write (* , *) ' j 0+j , jl+i , T = 9 , j 0+j , j l+i, T ( j 0+j , j 1+i) 
enddo 


63 



no lh o oonononoooo on on 


******** 


enddo 
ELSE 

C********** if ( Ji.GE.the size of T) do not put beta in !!!! 
do J = 1 , iblock 

do I = 1, iblock 
T ( JO + I , JO+vJ) = alfa (I, J) 
enddo 
enddo 
ENDIF 
C 

return 
end 

***★******★******★***************************** + ****■*.*****.*.***.*.** 


90 


* * * * * 


subroutine check (n,neig, isize,ncoff , ia , j a , an, ad, q, hh, eig, 
V, VP1 , am, dmass , lump , ishif t ) 
implicit real*8 (a-h,o-z) 

real *8 hh ( 1) # q (n, 1) , V(l) # VP1(1) , EIG(l) ,an(l) ,ad(l) 
real *8 am (1) # dmass (1) 
integer ia(l) ; ja(l) 

Error Norm Check: | |Ax-wMx| |/| | Ax | | ********* 

call cputime (tOl) 
rewind ( 15 ) 

read(15) ( ia ( i ) ,i = l,l+n) 
rewind ( 11) 

read(ll) ( j a ( i) , i = l , ncof f ) 
rewind (12) 

read(12) (an(i) ,i=l,ncoff) 
rewind ( 13 ) 

read(13) (ad(i) ,i=l,n) 

write (*,*)' lump , n, isize , ncof f = ', lump, n, isize, ncof f 
write ( * , 
write ( * , 
write ( * , 
write ( * , 
write ( * , 
write ( * , 
do k = 1 
write { * , * ) 
enddo 

write (23, * ) 

DO 580 L=1 
qmax=0 . OdO 
iqmax=0 
do i = 1 , n 

i f ( dabs ( q ( i , L ) ) . gt . qmax ) 
qmax=dabs ( q ( i , L ) ) 
iqmax=i 
endif 
enddo 

write (23 ,*)' L, iqmax, Qmax = ' , L , iqmax, qmax 
call mult spa (n, ia, ja, an, ad, q ( 1 , L) , v) 

VNORM=0 . OdO 
DO 590 1=1, N 
VNORM=VNORM+V ( I ) * V { I ) 

WNORM=0 . OdO 
if (lump . eq . 1) then 
Lumped Mass matrix 


* ) ' check : 

eig = 

' , (eig (i) , i = l,neig) 

* ) 7 check : 

ia = 1 

, (ia (i) , i=l , n+1) 

* ) 1 check : 

ja = ' 

, ( ja (i) , i = l , ncof f ) 

* ) ' check : 

ad = ' 

, (ad (i ) , i=l, n) 

+ ) ' check : 

an = ' 

, (an (i) , i=l , ncof f ) 

* ) ' check : 

dmass 

= ' , (dmass { i) , i=l, n 

# neig 
* ) 1 check : 

k,Q = 

' , (q(i,k) , i=l,n) 


' Check : 
, neig 


***★ Max. Element in the L-th Q vectors ****' 


then 


************** 


do 2239 i=l , n 
vpl (I) =dmass (i) *Q (i, L) 
2239 continue 
C 


Q* * * 


* * * 


else 

consistant Mass matrix ************* 
call multspa (n, ia, ja, am, dmass , q (1, L) ,VP1) 


C 


endif 


C 


600 

601 


c 


RT=1 . OdO/EIG (L) - float (ishift) 

DO 600 1=1, N 
V(I) =V(I) -RT+VP1 (I) 
continue 
do 601 i = 1 , n 
WNORM=WNORM+V ( I ) * V ( I ) 

VNORM=DSQRT (VNORM) 

WNORM=DSQRT { WNORM) 

write (23 , *)' L, wnorm,Vnorm = ' , L, wnorm, vnorm 
HH (L) =WNORM/ VNORM 

if {dabs (RT) .It . 1 . 0D-14) HH (L) = DABS (RT) 
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CONTINUE 

call cputime (t02) 
t02=t 02 - tOl 

WRITE (23 , *) ' *** K, * EIG*,*HERTZ *,* ERROR NORM *** ' 

DO 700 K=1 , NEIG 

HERTZ = 1 . 0/ (2.0*3.1415927) *DSQRT (max (0 . 0D0 , 1 . ODO/EIG (K) -ishift) ) 
WRITE (2 3,701) K, max (0 , 0D0 , 1 . ODO/EIG (K) -ishift) , HERTZ , HH (K) 

FORMAT (2X, 15, 2X, 4E15 . 7) 

CONTINUE 
return 
end 

★ + * * ★ * ★***★ * ★ + ★ ★ -*■ * * ★★ ★ ★ ★ + ★ * *****•★★★ + + + ★*★+** * * ★ ****** ★ ★ * ★ * * * 

C 

C SUBROUTINE JACOBI (A, B , X , EIGV, D, N, RTCL, NSMAX, IFPR, IOUT) 

c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
C 
c 
c 
c 
c 
c 
c 
c 
c 
c 

SUBROUTINE JACOB I (A, B, X, EIGV, D , N, RTCL , NSMAX , IFPR, ISHIFT) 
c Parameter (RTOL=10 . **-12 , NSMAX=15 ,N=2 , IFPR=0 , IOUT=6) 

IMPLICIT REAL* 8 (A-H,0-Z) 

REAL* 8 A(N,N) ,B(N,N) ,X(N,N) ,EIGV(N) , D(N) , BB 
NSMAX = 15 
IOUT = 6 
IFPR = 0 
RTOL = 1.0D-14 
c 
c 
c 
c 
c 
c 


This program is used in single precision arithmetic on 
cdc equipement and double precision arithmetic on ibm 
or univac machines, activate , deactivate or adjust above 
card for single or double precision arithmetic 


PROGRAM TO SOLVE THE GENERALIZED EIGENPROBLEM 
USING THE GENERALIZED JACOBI ITERATION 

-- INPUT VARIABLES -- 

A(N,N) =Stiffness matrix (assumed positive definite) 

B (N, N) =Mass matrix (assumed positive definite ) 

X (N, N) =Vector storing eigenvectors on solution exit 
EIGV (N) =Vector storing eigenvalues on solution exit 
D (N) ^Working vector 

N =order of matrices A and B 

RTOL =Convergence tolerance (usually set to 10. **-12 

NSMAX =Maximum number of sweeps allowed 

(usually set to 15) 

IFPR =flag for printing during iteration 

EQ.O No printing 

EQ.l Intermeddiate results are printed 
IOUT =output device number 

ISHIFT : K~=K+ishift*M 


-- OUTPUT 

A (N, N) 
B(N,N) 

X (N, N) 
EIGV (N) 


=Diagonalized stiffness- matrix 
=Diagonalized mass matrix 
=EIGENVECTors stored columnwise 
=Eigenvalues 


580 

c 


701 

700 


C INITIALIZE EIGENVALUE AND EIGENVECTOR MATRICES 

C 

do i = 1 , n 
do j = 1 , n 
b ( i , j ) = 0 . OdO 

enddo 

do j = 1 / i - 1 
a ( i , j ) = a ( j , i ) 

enddo 

b (i, i) = 1 . OdO 
enddo 

DO 10 1=1 , N 

IF (A(I, I) .GT. 0 .DO .AND. B ( I , I ) . GT . 0 . DO ) GO TO 4 
WRITE (IOUT, 2020) 

STOP 

4 D ( I ) =A ( I , I) /B (1,1) 

10 EIGV ( I ) =D ( I ) 

DO 30 1=1, N 

DO 20 J=1 , N 

X ( I , J) = 0 .do 


20 


QS 



IF (N.EQ.l) RETURN 


30 X ( I , I) =1. 0D0 
cccccccccccccccccccccccccccc 
IF (N.EQ.l) STOP 
c 

c INITIAALIZE SWEEP COUNTER AND BEGIN ITERATION 

c 

NSWEEP=0 
NR=N- 1 

40 NSWEEP=NSWEEP+1 

IF (IFPR.EQ.l) WRITE ( IOUT, 2000 ) NSWEEP 

c 

c CHECK IF PRESENT OFF-DIAGONAL ELEMENT IS LARGE ENOUGH TO REQUIRE 

c ZEROING 

c 

EPS= ( . 01**NSWEEP) * *2 
DO 210 J=1 , NR 
JJ=J+1 

DO 210 K=JJ, N 

EPTOLA= (A(J,K) *A(J, K) ) / (A(J, J) *A(K, K) ) 

EPTOLB= (B (J, K) *B(J,K) )/ (B(J,J) *B(K,K) ) 

IF (( EPTOLA.LT.EPS) .AND. (EPTOLB.LT.EPS) ) GO TO 210 
AKK=A (K, K) *B ( J, K) -B (K, K) *A(J,K) 

AJJ=A ( J, J) *B ( J, K) -B ( J, J) *A(J,K) 

AB=A ( J, J) *B(K,K) -A (K, K) *B ( J, J) 

CHECK= (AB*AB+4 . 0d0*AKK*AJJ) /4 . OdO 
IF (CHECK) 50,60,60 
50 WRITE (IOUT, 2020) 

STOP 

60 SQCH=DSQRT (CHECK) 

Dl=AB/2 . 0D0+SQCH 
D2 =AB/ 2 . ODO - SQCH 
DEN=D1 

IF (DABS (D2) .GT. DABS (Dl) ) DEN=D2 
IF (DEN) 80,70,80 
70 CA=0 . 

CG= - A ( J, K) /A (K, K) 

GO TO 90 
80 CA=AKK/DEN 
CG=-AJJ/DEN 
c 

c PERFORM THEGENERAL ROTATION TO ZERO THE PRESENT OFF DIAGGONAL ELEMENT 

c 

90 IF(N-2) 100, 190, 100 
100 JP1=J+1 
JM1=J-1 
KP1=K+1 
KM1=K- 1 

IF(JMl-l) 130, 110,110 
110 DO 120 1=1, JM1 
AJ=A (I , J) 

BJ=B ( I , J) 

AK=A (I , K) 

BK=B ( I , K) 

A (I, J) =AJ+CG*AK 
B ( I , J) =BJ+CG*BK 
A (I, K) =AK+CA*AJ 
120 B ( I , K) =BK+CA*BJ 
130 IF(KPl-N) 140, 140, 160 
140 DO 150 I=KP1 , N 
AJ=A ( J, I) 

BJ=B(J, I) 

AK=A (K, I) 

BK=B (K, I) 

A ( J, I) =AJ+CG*AK 
B(J, I) =BJ+CG*BK 
A (K, I) =AK+CA*AJ 
150 B (K, I) =BK-t-CA*BJ 
160 IF (JP1-KM1) 170, 170, 190 
170 DO 180 1= JP1 , KM1 
AJ=A(J, I) 

BJ=B (J, I) 

AK=A ( I , K) 

BK=B ( I , K) 

A ( J, I) =AJ+CG*AK 
B (J, I) =BJ+CG*BK 
A ( I , K) =AK+CA*AJ 
180 B (I, K) =BK+CA*BJ 
190 AK=A (K, K) 

BK=B (K, K) 

A(K, K) =AK+2 . 0D0*CA*A ( J, K) +CA*CA*A ( J, J) 



noooo non non odd non 


c 

c 

c 


B (K, K) =BK+2 . ODO *CA*B ( J, K) +CA*CA*B ( J, J) 

A(J, J) =A(J, J) +2 . ODO*CG*A{J, K) +CG*CG*AK 
B ( J, J) = B ( J, J) +2 . ODO *CG*B ( J, K) +CG+CG*BK 
A(J / K)=O.ODO 
B ( J, K) =0 . ODO 

UPDATE THE EIGENVECTOR MATRIX AFTER EACH ROTATION 

DO 200 1=1, N 
XJ=X { I , J) 

XK=X(I,K) 

X ( I , J) =XJ+CG*XK 
200 X { I , K) =XK+CA*XJ 
210 CONTINUE 

UPDATE THE EIGENVALUE AFTER EACH SWEEP 
DO 220 1=1, N 

IF (A (I, I) .GT . 0 . ODO . AND.B (I, I) . GT . 0 . ODO) GO TO 220 
WRITE (IOUT, 2020) 

STOP 

220 EIGV(I) =A(I, I) /B (I, I) 

IF (IFPR. EQ . 0) GO TO 230 
WRITE (* , 2030) 

WRITE (*, 2 010) {EIGV { I ) ,I = 1,N) 

CHECK FOR CONVERGENCE 

230 DO 240 1=1, N 
TOL=RTOL*D { I ) 

DIF=DABS {EIGV (I) -D(I) ) 

IF {DIF . GT . TOL) GO TO 280 
240 CONTINUE 


CHECK ALL OFF-DIAGONAL ELEMENTA TO SEE IF ANOTHER SWEEP IS REQUIRED 

EPS=RTOL*+2 
DO 250 J=1 , NR 
JJ=J+1 

DO 250 K=JJ,N 

EPSA= (A { J, K) *A ( J, K) ) / (A( J, J) +A(K,K) ) 

EPSB= (B { J, K) *B (J,K) ) / (B(J, J) +B(K,K) ) 

IF {{EPSA.LT.EPS) .AND. { EPSB . LT . EPS ) ) GO TO 250 
GO TO 280 
250 CONTINUE 

FILL OUT BOTTOM TRIANGLE OF RESULTANT MATRICES AND SCALE EIGENVECTORS 

255 DO 260 1=1, N 
DO 260 J=1 , N 
A { J, I ) =A ( I , J) 

260 B (J, I) =B (I, J) 

DO 270 J= 1 , N 
BB=DSQRT(B(J, J) ) 

DO 270 K= 1 , N 
270 X {K, J) =X(K, J) /BB 
CCCCCCCCCCCCCCCC RETURN 


write (*,*) 'THE EIGENVECTORS ARE 
write (*,*) ( (x(i,j) , j =1 , n) ,i = l,n) 

** Reorder the eigenvalues by decreasing order, by Jiangning QIn, Sep. 1995 
do I = 1,N 

do J = 1+1 , N 

if (eigv (j ). gt . eigv (i) ) then 
bb = eigv{j) 
eigv ( j ) =eigv ( i) 
eigv(i) = bb 
do k = l,n 
bb=X {k, j ) 
x {k, j ) =x (k, i) 
x (k, i) =bb 
enddo 
endif 
enddo 
enddo 

write (23 , *) 'THE EIGENVALUES in JACOBI ARE 
do i = 1 , n 
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' , EIGV ( I ) , ' 


** * 


c 

c 


WRITE (23,*) ' * * I ** 

enddo 

return 

STOP 


, i / 


c UPDATE D MATRIX AMD START NEW SWEEP, IF ALLOWED 

c 

280 DO 290 1 = 1 f N 
290 D(I) =EIGV(I) 

IF (NSWEEP . LT . NSMAX) GO TO 40 
GO TO 255 

2 00 0 FORMAT ( 2 7HOSWEEP NUMBER IN * JACOBI* = ,14) 

2010 FORMAT (1H0,6E2 0.12) 

2020 FORMAT (2 5HO***ERROR SOLUTION STOP / 

1 IX, 3 OH MATRICES NOT POSITIVE DEFINITE ) 

2030 FORMAT (3 6HOCURRENT EIGENVALUES IN * JACOBI* ARE,/) 
END 



SUBROUTINE JACOBI 2 (A, B , X, EIGV, N, D , rtol , nsmax) 

modified by runesha Nov 27, 1995 


c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 
c 

c Parameter (RTOL=10 . ** - 12 , NSMAX=15 , IFPR=0, I0UT=6) 

c Parameter (RTOL=0 . 000000000001 , NSMAX=15 # IFPR=0, IOUT=6) 

IMPLICIT REAL* 8 (A-H,0-Z) 

DIMENSION A (N, 1 ) , B (N, 1) , X (N, 1) ,EIGV(1) ,D(1) 


+ 

c 

c 

c 

c 

c 

c 


ABS (X) =DABS (X) 
SQRT(X) =DSQRT (X) 


This program is used in single precision arithmetic on 
cdc equipement and double precision arithmetic on ibm 
or uni vac machines, activate, deactivate or adjust above 
card for single or double precision arithmetic 


PROGRAM TO SOLVE THE GENERALIZED EIGENPROBLEM 
USING THE GENERALIZED JACOBI ITERATION 

-- INPUT VARIABLES -- 


A (N, N) 

B (N, N) 

X (N f N) 
EIGV (N) 
D (N) 

N 

RTOL 

NSMAX 


-- OUTPUT -- 


=Stiffness matrix (assumed positive definite) 
=Mass matrix (assumed positive definite ) 
=Vector storing eigenvectors on solution exit 
=Vector storing eigenvalues on solution exit 
=Working vector 
=order of matrices A and B 

=Converg ence tolerance (usually set to 10. **-12 
=Maximum number of sweeps allowed 
(usually set to 15) 


A (N, N) ^Diagonalized stiffness matrix 

B (N, N) ^Diagonalized mass matrix 

X (N , N) =Eigenvectors stored columnwise 

EIGV (N) =Eigenvalues 


c 

c 

c 


c 

c 

c 

c 


c 

c 

c 

c 

c 


INITIALIZE EIGENVALUE AND EIGENVECTOR MATRICES 
DO 10 1*1, N 

IF (A(I, I) .GT.O . .AND. B ( I , I ) . GT . 0 . ) GO TO 4 
WRITE (*,2020) 

STOP 

4 D { I ) = A (I,I)/B(I,I) 

10 EIGV (I) =D (I) 

DO 30 1=1, N 

DO 20 J=1 , N 
20 X { I , J) =0 . dO 
30 X (I, I) =1 . 

. . . IF (N.EQ. 1) RETURN 
IF (N.EQ.l) STOP 

INITIAALIZE SWEEP COUNTER AND BEGIN ITERATION 

NSWEEP= 0 
NR=N- 1 

40 NSWEEP=NSWEEP+1 

IF (IFPR.EQ.l) WRITE ( * , 2000 ) NSWEEP 

CHECK IF PRESENT OFF-DIAGONAL ELEMENT IS LARGE ENOUGH TO REQUIRE 
ZEROING 

EPS= ( . 01* *NSWEEP) * *2 
DO 210 J=1 , NR 
JJ=J+1 

DO 210 K=JJ,N 

EPTOLA= ( A ( J , K) *A(J,K) ) / (A ( J, J) *A(K,K) ) 

EPTOLB= (B ( J, K) *B ( J, K) ) / (B ( J, J) *B (K, K) ) 

IF (( EPTOLA.LT.EPS) .AND. (EPTOLB.LT.EPS) ) GO TO 210 
AKK=A(K,K) *B ( J, K) -B (K, K) *A(J,K) 

AJJ=A ( J, J) *B ( J, K) -B { J, J) *A ( J, K) 

AB=A( J, J) *B(K,K) -A (K, K) *B ( J, J) 
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CHECK= (AB*AB+4 . *AKK*AJJ) /4 . 

IF (CHECK) 50, 60, 60 
50 WRITE(*, 2020) 

STOP 

60 SQCH=DSQRT (CHECK) 

Dl=AB/2 . +SQCH 
D2=AB/2 . -SQCH 
DEN=D1 

IF (DABS (D2) .GT. DABS (Dl) ) DEN=D2 
IF (DEN) 80,70, 80 
70 CA=0 . 

CG= - A ( J, K) /A(K,K) 

GO TO 90 
80 CA=AKK/DEN 
CG= - A J J / DEN 
c 

c PERFORM THEGENERAL ROTATION TO ZERO THE PRESENT OFF DIAGGONAL ELEMENT 

c 

90 IF(N-2) 100, 190, 100 
100 JP1=J+1 
JM1= J- 1 
KP1=K+1 
KM1=K- 1 

IF (JM1-1) 130,110,110 
110 DO 120 1=1, JM1 
AJ=A ( I , J) 

BJ=B (I, J) 

AK=A ( I , K) 

BK=B (I, K) 

A ( I , J) =AJ+CG*AK 
B (I, J) =BJ +CG*BK 
A ( I , K) =AK+CA*AJ 
120 B ( I , K) =BK+CA*BJ 
130 IF(KPl-N) 140, 140,160 
140 DO 150 I=KP1 , N 
AJ=A ( J, I) 

BJ=B (J, I) 

AK=A (K, I) 

BK=B (K, I) 

A(J, I) =AJ+CG*AK 
B (J, I) =BJ+CG*BK 
A (K, I) =AK+CA*AJ 
150 B(K, I) =BK+CA*BJ 
160 IF (JP1-KM1) 170, 170, 190 
170 DO 180 I=JP1 , KM1 
AJ=A(J, I) 

BJ=B ( J, I) 

AK=A (I , K) 

BK=B (I , K) 

A ( J, I ) =AJ+CG*AK 
B (J, I) =BJ+CG*BK 
A ( I , K) =AK+CA*AJ 
180 B ( I , K) =BK+CA*BJ 
190 AK=A (K, K) 

BK=B (K, K) 

A (K, K) =AK+2 . *CA*A ( J, K) +CA*CA*A ( J, J) 

B (K, K) =BK+2 . *CA*B ( J, K) +CA*CA*B ( J, J) 

A ( J, J) =A ( J, J) +2 . *CG*A( J, K) +CG*CG*AK 
B ( J, J) =B ( J, J) +2 . *CG*B ( J, K) +CG*CG*BK 
A( J, K) =0 . dO 
B ( J, K) =0 . dO 
c 

c UPDATE THE EIGENVECTOR MATRIX AFTER EACH ROTATION 

c 

DO 200 1=1, N 
XJ=X (I, J) 

XK=X (I, K) 

X (I , J) =XJ+CG*XK 
200 X (I , K) =XK+CA*XJ 
210 CONTINUE 
c 

c UPDATE THE EIGENVALUE AFTER EACH SWEEP 

C 

DO 220 1=1, N 

IF ( A ( I , I ) . GT . 0 . ,AND.B(I,I) .GT.OJGO TO 220 
WRITE (*,2020) 

STOP 

220 EIGV(I) =A(I, I) /B(I, I) 
c IF (IFPR.EQ. 0) GO TO 230 

c WRITE (IOUT, 2030) _ 

7o 



c 

c 


WRITE (IOUT, 2010} (EIGV(I) , 1=1, N) 
CHECK FOR CONVERGENCE 


c 

230 DO 240 1=1, N 
TOL=RTOL*D { I ) 

DIF=DABS (EIGV(I) -D(I) ) 

IF (DIF . GT . TOL) GO TO 280 
240 CONTINUE 


c 

c CHECK ALL OFF-DIAGONAL ELEMENTA TO SEE IF ANOTHER SWEEP IS REQUIRED 

C 

EPS = RTOL* * 2 
DO 250 J=1,NR 
JJ=J+1 

DO 250 K= JJ , N 

EPSA= (A ( J, K) *A ( J, K) ) / {A ( J, J) +A (K # K) ) 

EPSB= (B ( J, K) *B ( J, K) ) / (B(J, J) +B(K,K) ) 

IF ((EPSA.LT.EPS) .AND. (EPSB . LT . EPS) ) GO TO 250 
GO TO 280 
250 CONTINUE 
c 

C FILL OUT BOTTOM TRIANGLE OF RESULTANT MATRICES AND SCALE EIGENVECTORS 

C 

255 DO 260 1 = 1, N 
DO 260 J=1 , N 
A(J, I) =A ( I , J) 

260 B ( J, I ) =B (I , J) 

DO 270 J=1 , N 
BB=DSQRT (B ( J, J) ) 

DO 270 K=1 , N 
270 X (K, J) =X (K, J) /BB 


* write (*,*) * THE EIGENVECTORS ARE : ' 

* write(*,*} ( (x(i, j) , j=l,n) ,i=l,n) 
return 

c 

c 

c UPDATE D MATRIX AMD START NEW SWEEP, IF ALLOWED 

c 

280 DO 290 1=1, N 
290 D ( I ) =EIGV(I) 

IF (NSWEEP . LT . NSMAX) GO TO 40 
GO TO 255 

2000 FORMAT (2 7HOSWEEP NUMBER IN *JACOBI* = ,14) 

2010 FORMAT (1HO, 6E20. 12) 

2 02 0 FORMAT ( 2 5H0* **ERROR SOLUTION STOP / 

1 3 OH MATRICES NOT POSITIVE DEFINITE ) 

2030 FORMAT { 3 6HOCURRENT EIGENVALUES IN *JACOBI* ARE,/) 
END 


subroutine matmat3 {a , b , n, iq, temp) 
c xx (n*iq) =xx (n*iq) *phi (iq, iq) 

c a = a * b 

implicit real*8 (a-h, o- z) 
dimension a ( 1) , b (iq, 1) , temp (1) 

iqq=iq-mod (iq, 8 ) 
do 10 i=l,n 
do 2 0 j =1 , iq 

20 temp(j) = a((j-l)*n+i) 
do 30 k=l , iq 
sum=0 . do 
do 1=1 , iqq, 8 

sum=sum + temp (1) *b (1, k) +temp (1+1) *b (1 + 1, k) 

+ + temp (1+2 ) *b (1+2 , k) +temp (1+3 ) *b (1+3 , k) +temp (1+4) *b (1+4 , k) 
+ + temp (l+5)*b(l+5,k) +temp (1+6) *b (1+6 , k) + temp (1+7) *b (1+7, k) 
enddo 

do 40 l=iqq+l,iq 
40 sum = sum + temp (1) *b (1, k) 

30 a ( (k-1) *n+i) =sum 
10 continue 


return 

end 
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c This version does not have any loop enrolling 

subroutine multspa2 (n, ip, indx r coef , diag, rhs , temp) 
c subroutine multsp (n, ia, ja, an, ad, b, c , isupd, iptrs) 

implicit real*8 (a-h,o-z) 
real et (2 ) 

dimension rhs (1) , coef (1) , indx (1) 
dimension diag (1) , temp (1) , ip (l) 
ncoef = ip (n+1) -1 
c ettl=etime (et ) 

ttl=et (1) 
do i = 1 , n 

temp (i) =diag (i) *rhs (i) 
enddo 

do 10 i = 1 , n 
if irst=ip (i) 
ilast=ip ( i+1) -1 
suml=temp ( i ) 

C DIR$ IVDEP 

do k=if irst , ilast 
kk= indx (k) 

suml=suml+coef (k) *rhs (kk) 

temp (kk) =temp (kk) +coef (k) *rhs (i) 

enddo 

temp ( i) =suml 
10 continue 

c ett2=etime (et ) 

c tt2=et ( 1) 

c time2= ( tt2 - ttl) 

c noper= ncoef*4+n 

c write (*,*) 'Time in Matrix-by-Vector = ' , time2 

return 
end 


subroutine normcheck ( iq, n, neig, ncoef , ia, j a , an, ad, dm, am, eigv, 
+ xx, v, vpl , lump, err , ishif t) 

implicit real*8 (a-h,o-z) 
real*8 an(l) ,ad(l) , eigv(l) ,xx(l) ,dm(l) 

+ , err (1) , v (1) , vpl (1) , am (1) 

integer ia ( 1 ) , j a ( 1 ) 
real t01,t02,t03 

c Purpose : Calculate | | Kx-eigv.Mx | | / | | K.x| | 

call cputime (tOl) 


c 

c 

c 

c 

c 

c 

c 

c 


rewind (15) 

read(l5) ( ia ( i) , i=l , 1+n) 
rewind ( 11 ) 

read (11) { ja (i) , i=l, ncoef ) 

rewind (12 ) 

read (12) (an(i) , i=l, ncoef) 
rewind (13 ) 

read(13) (ad (i) , i=l , n) 


do 100 k=l,neig 

call multspa2 (n, ia, ja, an, ad, xx ( (k-1) *n+l) , v) 
VNORM=0 . OdO 
do 110 i=l,n 

110 VNORM= VNORM+V { I ) *V(I) 

WNORM=0 . OdO 
if (lump . eq. 1) then 

C* + ★ * ★ Lumped Mass matrix * + **★*** + + + ** + 

do 120 i=l,n 

vpl (I) =dm(i) *xx( (k-1) *n+i) 

120 continue 


else 

C****** consistant Mass matrix *+★★++***+*** 

call multspa2 (n, ia, ja,am,dm,xx( (k-1) *n+l) ,VP1) 
endif 


RT= eigv(k) 

DO 600 1 = 1, N 
V(I) =V(I) -rt*VPl (I) 

600 continue 

do 601 i = 1 , n 

601 WNORM=WNORM+V(I) +V(I) 

VNORM=DSQRT (VNORM) 

WNORM=DSQRT (WNORM) 
err (k) = WNORM /VNORM 

100 CONTINUE 
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u u 


call cputime (t02) 
tO 3 = t02 - tOl 

write (2 3,*) ' Time normcheck = ',t03 

WRITE (23 , *) 

'***** # ++,*+* EIGV *** , *** HERTZ *** , ** ERROR NORM ** ' 

if ( ishif t . ne . 0) then 
do i=l,neig 

eigv (i) = eigv (i) -float ( ishif t) 
enddo 
endif 


DO 900 K=1,NEIG 

c hertz =dsqrt (eigv (k ) ) / (2.0*3.1415927) 

hertz =dsgrt (dabs (eigv (k) ) ) / (2 . 0*3. 1415927) 
WRITE (23 , 901) K, EIGV(K) , HERTZ , err (K) 

901 FORMAT (2X, 15 , 2X, 4E15 . 7) 

900 CONTINUE 

return 
end 


subroutine spasubspace (iq # n, ncoef , neig, lumas, mtot, ik r jk,dk 
+ , ak, dm, am, iu, ju, di , un, xkk, xmm, phi , eigv, eigvO, tol j 

+ , temp, xx, ishif t) 

c developped by runesha October 20,1995 

implicit real * 8 (a-h , o- z ) 

real *8 ak ( 1) , dk (1) , dm (1) , am (1) , un (1) , di (1) ,xx (1) , xkk (iq, 1) , 

+ xmm (iq, 1} , temp (1) , eigv (1) , eigvO (1) , tol j (1) ,phi (iq, 1) 

integer ik(l) , jk(l) ,iu(l) , ju(l) , itertot 
c real tl , t2 , t 3 , t4 , t5 , t6 , t7 

real tl,t7 



c Purpose : This subroutine computes NEIG eigenvalues and eigenvectors 

c of an NxN matrix 

c This subroutine is called by the main program 

c 

c INPUT : - ik (n+1 ) , j k (ncoef ) , ak (ncoef ) , dk (n) : given upper triangular 

c stiffness matrix in RR(U)U 

c -ik (n+1) , jk (ncoef ), am(ncoef) , dm (n) : given upper triangular 

c mass matrix in RR(U)U 

c -NEIG : number of desired eigenvectors and eigenvalues 

c -NCOEF : number of upper triangular non zero coefficients 

c diagonal values not included, 

c -LUMAS : = 0 Case of lumped mass 

c =1 case of consistent mass 

c OUTPUT : 

c Working space: 

c 


c 


c 


c 


c 


c 

c 

c 


c 


call cputime (tl) 

To limit the number of iterations 

itertot* 100 

ICONV=0 

toler=0 .000000000001 

nsmax=12 

Initialization 

do i=l,iq 

eigvO ( i) =0 . 

enddo 

call cputime (t2) 


write (23 , *) 
write (23 , *) 
write (23 , * ) 
write (23 , * ) 
write (23, * ) 
write (23 , * ) 
write (23 , * ) 
write (23 , * ) 
write (23 , * ) 


> * * ' 

' * OUTPUT VECTOR-SPARSE SUBSPACE ITERATION *' 

/ + * ' 

/ +*****★ + ***★★*★*★ + + ***★★'*•**** + *★★★************** * 

' NEQ = ' , n 
' NCOEF= ncoef 
' LUMAS = ' , lumas 
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c write (23,*) 'NEIG = ',neig 

write (23, * ) 'IQ = ' , iq 

c write (23,*) 'MAX NUMBER OF ITER ALLOWED =',itertot 

c write (23,*) 'TOLER IN JACOBI =',toler 

memory=2*ncoef +2*ncoef 2+7*n +3*iq*iq +3*iq+n*iq 
c write (23,*) 'Total memory = memory 

c call cputime ( t 3 ) 

C . . . . STARTING ITERATION VECTORS . 

c.. The following is the basic just 1 on the diag not even Kii/Mii 

cp ii=n*iq 

cp do i=l , ii 

cp xx ( i) =0 . do 

cp enddo 

cp do i=l , iq 

cp xx ( i+n* (i-1) ) =1 . do 

cp enddo 

c KJ Bathe style starting vector Iteration 

nd=n/iq 

c if ( lumas . ne . 1 ) go to 4 

j=0 

do 6 i=l,n 
xx (i) =dm(i) 
if (dm (i) . gt . 0) j = j +1 
6 temp ( i) =dm { i) /dk (i) 

if(iq.le.j) go to 16 

write (*,*) ' IQ CANNOT BE LARGER THAN 3 DIAG NON ZERO ' 

stop 

c 4 do 11 i=l,n4 

c xx ( i) =dm (i) 

c 11 temp ( i) =dm (i) /dk (i) 

16 do 20 i=n+l,n*iq 
20 xx ( i) =0 

l=n-nd 

do 30 j =2 , iq 
rt=0 

do 40 i=l , 1 

if (temp (i) .It.rt) go to 40 
rt=temp ( i) 

i j -i 

40 continue 

do 50 i = 1 , n 

if ( temp ( i) . le . rt) go to 50 

rt=temp { i) 

ij=i 

50 continue 

tol j ( j ) afloat (ij ) 
temo ( i j ) =0 . 

1=1 -nd 

30 xx ( ( j - 1) *n+ij ) =1 . 

c write (*,*) 'Degrees of freedom excited by unit starting 

c + iteration vector are:' 
c write (*,*) ( tol j { j ) , j =2 , iq) 

c A random vector is added to the last vector 

pi=3 . 141592654 
xxx=0 . 5 
iloc= (iq-1) *n 
do 60 k=l,n 
xxx= (pi+xxx) **5 
ix=int (xxx) 
xxx=xxx- float ( ix) 

60 xx ( iloc+k) =xx ( iloc+k) +xxx 


c call cputime (t4) 

c write (* , * ) ' STARTING ITERATION VECTORS ' 

c do j=l,iq 

c write(*,*) (xx ( ( j - 1) *n+i) , i = l, n) 

c enddo 

c. . . . End Iteratin vectors 


C . . . . BEGIN SUBSPACE ITERATIONS 
c call cputime ( t 5 ) 

nl=n-mod (n, 8) 
iter=0 



u u 


10 iter-iter+1 

C CALCULATE THE PROJECTION OF K AND M 

do 110 j = 1 , iq 

call f be ( n , iu , j u , di , un ,xx( ( j - 1 ) *n+l) , temp, iopfb) 
do 130 i= j , iq 
sum =0 . dO 
do k=l , nl , 8 

sum =sum + xx ( ( i - 1 ) *n+k) *temp (k) + xx ( ( i -1 ) *n+k+l) *temp (k+1) 

4 - +xx( (i-1) *n+k+2 ) *temp(k+2) +xx ( ( i - 1 ) *n+k+3) *temp(k+3) 

+ +xx( (i-1) *n+k+4) *temp (k+4) +xx ( (i-1) *n+k+5) + temp (k+5) 

+ +xx ( (i-1) *n+k+6 ) *temp (k+6) +xx ( (i-1) *n+k+7) + temp (k+7) 

enddo 

do 140 k=nl+l,n 

140 sum =sum + xx ( ( i - 1) *n+k) *temp (k) 

130 xkk ( i , j ) = sum 
do 150 k=l,n 

150 xx ( ( j - 1 ) *n+k) =temp (k) 

110 continue 

IF ( lumas . ne . 1) THEN 
do 160 j =1 , iq 

call multspa2 (n , ik, jk , am, dm, xx ( ( j -1) *n+l) , temp) 
do 180 i=j , iq 
sum=0 . dO 
do k=l , nl , 8 

sum =sum + xx ( { i- 1 ) *n+k) *temp (k) + xx ( (i-1) *n+k+l) *temp (k+1) 

+■ +xx ( (i-1) *n+k+2) *temp (k+2) +xx ( (i-1) *n+k+3) *temp (k+3) 

+ + xx ( (i-1) *n+k+4) *temp (k4-4 ) +xx ( (i-1) *n4-k+5) *temp (k+5) 

+ +xx ( (i-1) *n+k+6) *temp (k+6 ) +xx ( (i-1) *n+k+7) *temp (k+7) 

enddo 

do 190 k=nl+l,n 

190 sum =sum + xx ( ( i- 1 ) *n+k) *temp (k) 

180 xmm(i,j)= sum 

if (iconv.gt.0) go to 160 
do 200 k=l,n 

200 xx ( { j -1) *n+k) =temp (k) 

160 continue 
ELSE 

do 162 j = 1 , iq 
do i=l,n 

temp (i) =xx ( { j -1) *n+i) +dm (i) 
enddo 

do 182 i= j , iq 
sum=0 . dO 
do k=l , nl , 8 

sum =sum + xx { ( i- 1) *n+k) *temp (k) + xx ( { i- 1} *n+k+l) *temp (k+1) 

+ +xx ( (i-1) *n+k+2) *temp (k+2) +xx ( (i-1) *n+k+3) *temp (k+3 ) 

+ +xx ( (i-1) *n+k+4) *temp (k+4) +xx { (i-1) *n+k+5 ) *temp (k+5 ) 

+ +xx ( (i-1) *n+k+6) *temp (k+6) +xx ( (i-1) *n+k+7) *temp (k+7) 

enddo 

do 192 k=nl+l,n 

192 sum =sum + xx ( ( i- 1 ) *n+k) *temp (k) 

182 xmm ( i , j ) = sum 

if (iconv.gt.0) go to 162 
do 202 k=l,n 

202 xx ( { j -1) *n+k) =temp (k) 

162 continue 
ENDIF 


do 131 i=l,iq-l 
do 132 j=i+l,iq 
xmm ( i , j ) =xmm ( j , i ) 
xkk ( i , j ) =xkk ( j , i ) 
132 continue 
131 continue 


c 

write { * , * } 

' REDUCED 

STIFFNESS MATRIX' 

c 

do i=l , iq 



c 

write (* , * ) 

(xkk ( i , j ) 

, j=l,iq) 

c 

enddo 


c 

write (* , * ) 

' REDUCED 

MASS MATRIX' 

c 

do i=l , iq 



c 

write ( * , * ) 

(xmm(i, j ) 

, j-l,iq) 

c 

enddo 



SOLVE FOR EIGENSYSTEM OF SUBSPACE OPERATIONS 
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call j acobi2 (xkk, xmm, phi , eigv, iq, temp, toler , nsmax) 
c write (*,*) ' AFTER JACOBI ' 

c write ( * , * ) ' PHI ' 

c write (* , *) ( (phi (i, j ) , i=l, iq) , j=l, iq) 

c write (*,*) 'EIGV' 

c write ( * , * ) (eigv(i) , i=l, iq) 

c ARRANGE EIGENVALUES AND EIGENVECTORS IN ASCENDING ORDER 

do i = l , iq 
do j=i+l , iq 

if (eigv(i) .gt.eigv(j) ) then 
eigvt=eigv ( j ) 
eigv ( j ) =eigv ( i) 
eigv ( i) =eigvt 
do 220 1=1 , iq 
phit=phi ( 1 , j ) 
phi (1 , j ) =phi (1, i) 

220 phi ( 1 , i ) =phit 

endif 
enddo 
enddo 

c 


c 

c 

c 

c 

c 


write { * , * ) 
write {* ,*) 
write ( + , *) 
write ( * , * ) 
write ( * , * ) 


'AFTER ARRANGING IN ASCENDING ORDER ' 
' PHI ' 

( (phi (i, j ) , i=l, iq) , j=l, iq) 

' EIGV' 

(eigv(i) , i=l, iq) 


c IMPROVED APPROXIMATION TO THE EIGENVECTORS 

c Xk+1 =Xk+l + Qk+1 

call matmat3 (xx, phi f n, iq, temp) 
c write { * , * ) ' IMPROVED EIGENVECTORS ' 

c write(*,*) (xx (i) , i=l, n*iq) 

c CHECK FOR CONVERGENCE OF EIGENVALUES 

c call cputime { 1 6 ) 

if (iconv.gt.O) go to 500 
do 400 1=1 , iq 

dif f =dabs (eigv ( i) -eigvO (i) ) 

400 tol j ( i) =dif f /eigv ( i) 
do 410 i=l,neig 
if (tol j (i) .gt . toler) go to 440 
410 continue 

iconv=l 
go to 10 

440 if (iter . It . itertot) go to 420 
iconv=2 

write (*,*) 'The number of iterations is larger than the allowed 
+ iter', iter, itertot 
go to 10 

420 do 430 i=l , iq 

430 eigvO ( i) =eigv ( i) 

go to 10 

c End of iteration 

500 continue 

call cputime (t7) 


c500 

write (23 , * ) 
write (23 , * ) 
write (23, * ) 
write (23, * ) 

' RESULTS FOR EIGENSOLUTION' 

' RESULTS FOR EIGENSOLUTION' 

/ / 

c 



write (23, * ) 

'Number of iterations =',iter-l 

c 

write (23, * ) 


c 

write (23, * ) 

' Eigenvalues ' 

c 

write (23 , * ) 

(eigv(i) ,i=l,neig) 

c 

write (23 , * ) 


c 

write (23, * ) 

' Eigenvectors ' 

c 

do 510 j = 1 , 

neig 

c510 

write (23, * ) 

(xx ( ( j -1) *n+i) , i=i, n) 


write (23 * ) 

write (23 ',*) 'TOLERANCE CHECK ON EIGENVALUES ' 
write (23,*) ' * # * , * EIGV * , * TOLJ* ' 


if ( ishif t . ne . 0) then 

do i=l,neig 

write (23 , +) ' ' , i , ' 

enddo 


' , eigv (i) -float (ishif t) , 

7G 


tolj (i) 



n o 


c 

c 

c 

c 


else 

do i=l,neig 
write (23, * ) 
enddo 
endif 


i , ' ' , eigv (i) , 


tolj (i) 


write {23, * ) 
write (23 , *) 
write (23, * ) 
write (23, * ) 

write (23 , *) 
write { 23 , * ) 


Timing ' 

Time Init + Start. Iter. Vect. 
+(t4-t3) 

Time subspace Iter. 

Time subspace Iter. 


ERROR NORM CALCULATION : | |Kx-eigv.Mx| |/| |Kx| | 


(t2-tl) 

t6 - 15 
t7-tl 


call normcheck ( iq, n , neig, ncoef , ik, jk , ak, dk, dm, am, 
+ eigv, xx, temp , eigvO , lumas , tolj , i shift) 


return 

end 
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c******************************************************-**************-******* 

C CONSISTANT MASS CASE + SHIFT FOR REGULAR LANCZOS 

c 

c H. Runesha dec 22, 1997 

c 

£***★ + *★* + ★***★*★*********** + + + + + *** + + *★***★*★*****★***★**★★***-*■**★****★★★ + ★ 

C SUBROUTINE SP2LAN2 (N, LANMAX , NEIG , UN, DI , IU, JU, W, wl , w2 , w4 , 

c $TEM , EIG, ERR, VEC, ncof f , ncof 2 , dmass ,Q, lump, ishif t 

c + am, ia , j a) 

SUBROUTINE SP2LAN2 (N, LANMAX , NEIG , UN, DI , IU, JU, W, wl , w2 , w4 , 

$TEM, EIG, ERR, VEC , ncof f , ncof 2 , dmass , Q, 

+am, lump , ishif t , ia , j a) 


implicit real*8 (a-h,o-z) 

REAL* 8 W(l) , VECdanmax, lanmax) ,EIG(1) ,W4{1) ,W2(1) ,Q(N, LANMAX) 
REAL* 8 TEM(l) ,ERR(l) , wl ( 1 ) , dmass {1) , UN (1) , DI (1) , am(l) 

INTEGER IU(1) ,JU(1) ,ia(l) , ja(l) 

COMMON /Q IN/ QLAN , EPSOO 
common/ sp2com/nbuf ( 4 ) 

rewind (10) 

read(10) (dmass ( i ), i = l , n) 
c if (neig . ne . 0 . and . lump . ne . 1) then 

rewind ( 17 ) 

read(17) (am ( i) , i = l , ncof f ) 
c write(*,*) (am ( i ) , i = l , ncof f ) 

c endif 


c j 1 ai besoin de IA et JA pour le multiplication de la mass 

c les valeurs ont ete detruites pendant la factorization 

c essayer de ne pas les relire pendant le normcheck 

rewind ( 15 ) 


c 

c 

read(l5) ( ia ( i ) , i=l , 1+n) 
rewind (11) 

read(ll) ( ja (i) , i = l , ncof f ) 
write ( * , * ) ' 

c 

write ( * 

, *) 


c 

write ( * 

, *) 

' Inside Spalan ' 

c 

write ( * 

y * ) 

1 Lump = ' , lump 

c 

write ( * 

y*) 

' N, LANMAX, NEIG, ncof f , ncof 2 ' 

c 

write ( * 

y *) 

N, LANMAX, NEIG, ncof f, ncof 2 

c 

write ( * 

,*) 

'DM = ; , (dmass (i) , i=l , n) 

c 

write ( * 

y *) 

'AM =' , (am(i), i=l,ncoff) 

c 

write ( * 

y *) 

' IA =' , (ia (i) , i = l , 1+n) 

c 

write ( * 

y *) 

' JA =' , ( ja (i) , i=l , ncof f ) 

c 

write ( * 

, * ) 


c 

write ( * 

y*) 

' 

c 

write ( * , 

*) ' 

' $$$ UN =' , (un(i) y i=ly 15) 


iam=0 
nodes=l 
NBLK = 1 

RTOL = 0.000000000 ldO 
NNNN= 4 *NEIG 

IF (NNNN.GE .LANMAX) NNNN=LANMAX-1 
MN=0 

DO 61 1=1, LANMAX 
EIG (I) =0 . OdO 
61 ERR { I ) =0 . OdO 

EPS=GETEPS (IBETA, IT, IRND) 

EPSOO=EPS 

EPS=dsqrt (EPSOO) 

EPSb=EPS* 5 . OdO 
wnor=0 . OdO 
k=l 

do 14 i=l,n 
w ( i ) =1 . OdO 

c if (k.gt . 100) k = 1 

14 wnor=wnor+w ( i) *w (i) 
wnor=l . OdO/dsqrt (wnor) 

DO 15 1=1, N 

W(I) =w(i) *wnor 
Wl (i) =0 . OdO 

15 CONTINUE 



1501 CONTINUE 
C do 9901 i=l,n 

c w2 (i) =dmass (i) *w ( i) 

c990 1 continue 

call multspa2 (n, ia , ja, am, dmass , w, w2 ) 
bet=0 . OdO 
do 99011 I = 1 , N 

99011 bet=bet+w2 { i) *w (i) 
bet=dsqrt (bet) 

do 99012 1=1, N 
W1 (i) =w(i) /BET 
w ( i ) =0 . OdO 
0(1,1) = Wi(i) 

99012 continue 

c do 99013 1=1, n 

c99013 w2 ( i) =dmass (i) *W1 (i) 

call multspa2 (n, ia , ja, am, dmass , wl , w2 ) 

DO 50 JJ=1 , LANMAX-1 
DO 1001 1=1, N 
1001 W4 ( I ) =W2 ( I ) 

call f be (n, iu, ju, di , un, w2 , w4 , iopfb) 

DO 55 1=1, N 
W4 (I) =W4 (I) -W(I) *BET 
55 CONTINUE 

ALF=DDOT (N, W2 , 1 , W4 , 1) 

DO 60 1=1, N 

W4 ( I ) =W4 ( I ) - ALF*W1 ( I ) 

60 CONTINUE 

c do 9902 i=l,n 

c9902 w2 (i) =dmass (i) *w4 (i) 

call multspa2 (n, ia, ja, am, dmass , w4 , w2 ) 

BET2 =DDOT (N , W4 , 1 , W2 , 1 ) 
bet2=dsqrt (bet2) 

c write (*,*)' before call REORTH : JJ = ' ,jj 

CALL REORTH (N,W2, 

&W4 , Q , EPS , LANMAX , TEM , J J , BET2 ) 
c do 9903 i=l,n 

c9 903 w2 ( i ) =dmass ( i) *w4 ( i) 

call multspa2 (n, ia, ja, am, dmass , w4 , w2) 

BET2 =DDOT ( N , W4 , 1, W2,l) 

c write (*,*) ' before 100 : JJ,beta = ',jj,beta 

bet2=dsqrt (bet2) 

100 CONTINUE 

DBET2=1 . OdO /BET2 
DO 110 1=1, N 
W(I) =W1 (I) 

Wl (I) =W4 (I) *DBET2 
Q(I, JJ+1) =W1 (I) 

W2 (I) =W2 (I) *DBET2 
110 CONTINUE 

EIG (JJ) =ALF 
ERR ( JJ) =BET2 
BET=BET2 

c write ( * , * ) ' JJ , ALF , 3ET2 , iam= ' , j j , alf , bet 2 

IF (JJ.LT.NNNN) GO TO 50 
c if (nodes . gt . 1) call mp_sync (nbuf (4) ) 

c write (*,*) ' ishift =' , ishift 

CALL JACOB I P2 (N, NEIG, JJ, BET2 , W ( 1) ,Q,VEC,EIG, 

$ERR , UN , TEM , RTOL , LANMAX , DMASS , AM , 

$W1 , N+l , NNNN, NSUC, ncof f , ncof 2 , iu, ju, di, ishift) 

IF (NSUC.EQ.l) GO TO 2211 
50 CONTINUE 

120 CONTINUE 

CALL JACOB I P2 (N, NEIG, JJ- 1 , BET2 , W (1) ,Q,VEC,EIG, 

$ ERR , UN , TEM , RTOL , LANMAX , DMAS S , AM , 

$W1, N+l, NNNN, NSUC, ncof f, ncof 2, iu, ju,DI, ishift) 

2211 CONTINUE 

RETURN 

END 

c 

£★★*★★**★★★* + ***★** + +★**+★*★★**★★ + ★* + + *★★*+++*+*+ + ★***★★*★***★++**** + +*** 

c 


SUBROUTINE JACOB I P2 (N, NEIG, NJ, BET2 , V, Q, VEC , 

$ EIG, ERR, AN, HH, RTOL, LANMAX, DMASS, AM, 

$ VP1,NP1 , NNNN, NSUC, ncof f , ncof 2 , ia, ja,ad, ishift) 


implicit real*8 (a-h,o-z) 
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2 


1 

c 

c 


21 

c 

20 

C 

30 

c 

c 

C 

c 


c 

c 

c 

c 

c 

c 

c 


c 


c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c 

c510 


c 

c 

c 

c 


REAL* 8 DMASS (1) , Q (N, LANMAX) , VEC ( lanmax, lanmax) 

$ , EIG ( 1) , ERR (1) , HH (1) , VP1 (1) , V (1) , AN (1) , ad (1) , AM (1) 
integer ia(l),ja(l) 
common/ sp2com/nbuf (4) 

COMMON /QIN/ QLAN, EPS 

COMMON /NQIN/ NQL 

COMMON /QJA/ SUB (1000) 

iam=0 

nodes=l 

NSUC=0 

DO 1 1=1, LANMAX 
DO 2 J=1 , LANMAX 
VEC ( I , J) = 0 . OdO 
HH ( I ) =EIG ( I ) 

SUB (I) =ERR (I) 

VEC (I, I) =1 . OdO 
CONTINUE 

if (nodes . gt . 1) call mp_sync (nbuf (4 ) ) 

CALL TQL2 (LANMAX, NJ, EIG, ERR, VEC, IRR, EPS ) 
write {*,*)' IN JACOBQ: TQL2 is done 1!!! ' 

DO 20 1=1, NJ 
err (i) =0 . OdO 
do 21 j=l,nj 

err (i) =err ( i) +abs (vec ( j , i) ) 

ERR (I) =err (i) *0 . 0 0 5d0*ABS (BET2*VEC (NJ, I) /EIG (I) ) 
write (* , *) ' I , err (i) , vec (nj , i) =' , i, err (i) , vec (nj , i) 
CONTINUE 
DO 30 1=1 , NEIG 

IF (ERR (I) .GT. 0.0000 01. AND.NJ.LT. (LANMAX- 1) ) GO TO 3 00 
IF (ERR (I) .GT. RTOL .AND. NJ . LT . (LANMAX-1) ) GO TO 300 
CONTINUE 

if (nodes . gt . 1) call mp_sync (nbuf (4 ) ) 

CALL VECTRA(N, NEIG, 0, VEC, Q, LANMAX, NJ,V) 
write (*,*)' IN JACOBQ: VECTRA is done !!!! ' 

NSUC : =l,succesful 
NSUC= 1 

DO 580 L= 1 , NEIG 
call cputime (tOl) 
ninc=8*nodes 
do L=l,neig 
do i=n-nadd+l,n 
q ( i , L) =0 . OdO 
enddo 
enddo 

write (*,*)' NADD = ',nadd,n 

endif 

rewind ( 15 ) 

read(15) { ia ( i ) , i=l , 1+n) 
rewind (11) 

read(ll) ( ja ( i) , i=l, ncof f ) 
rewind (12 ) 

read(l2) (an ( i) , i=l , ncof f ) 
rewind (13 ) 

read(13) (ad (i) , i=l , n) 

write(*,*) '* + pi*+ AD=' , (ad (i) , i = l , n) 


write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
write ( * , * ) 
do 510 j= 


' lanmax, n, neig' 
lanmax, n, neig 


IA 

JA = 
AN = 
AD = 
DM = 
EIG= 
1 , neig 


(ia (i) , i=l , 1+n) 

( ja (i) , i=l , ncof f ) 
(an (i) , i=l , ncof f ) 
( ad ( i ) , i=l , n) 
(dmass (i) , i=l , n) 
(eig (i) , i=l, iq) 


write (*,*) (q(i,j) ,i = l,n) 


NORMCHECK2 ' 


DO 580 L=l,neig 

call multspa (n, ia, ja, an, ad, q, v) 
call mulmeiko (n, , , , ) 
if (nine . gt . 8 ) call mp_sync (nbuf (4) ) 
call sp2mul (iam,ninc,n, icolg,maxa, stif ,q(l,L) , vpl, v) 
call multspa (n, ia, ja, an, ad, q (1, L) , v) 
write (*,*)' Q(i,L) = f ,L,(q(i,L) ,i=l,n) 

VNORM= 0 . 0d0 
DO 590 1=1, N 

So 


c 



590 VNORM=VNORM+V ( I ) *V(I) 

WNORM=0 . OdO 
c do 2239 i=l , n 

c vpl (I ) =dmass (i) *Q (i , L) 

c223 9 continue 

call mult spa (n, ia, ja , am, dmass , q ( 1 , L) , vpl) 
c write(*,*) ' I , vpl=', i, (vpl ( ii) , ii = l , n) 

RT= 1 . OdO /EIG { L) 

DO 600 1=1, N 

c erm=max (abs (v (i) ) , abs (rt*vpl (i) ) ) 

c error=abs {v ( i) -rt*vpl (i) ) /erm 

c if (error . gt . 0 . 95d0 . and. erm. gt . 0 . OOOOldO ) then 

c write (*,*)'L,I,V(I) , ,=',L,I,v(i) , rt*vpl (i) , error, q ( i , L) 

c endif 

V(I) =V(I) - RT*VP1 (I) 

600 continue 

c write!*,*) / V(I)= , I (v(i),i=l,n) 

do 601 i = l,n 

6 01 WNORM=WNORM+V (I) *V(I) 

c if (iam. eq. 0) write (*,*) 'VNORM, WNORM = ' , vnorm, wnorm, L 

VNORM=DSQRT (VNORM) 

WNORM=DSQRT (WNORM) 

HH (L) = WNORM /VNORM 
58 0 CONTINUE 

c write(*,*) ' HH =' , (hh ( i) , i=l , neig) 

call cputime (t02) 
t 02=t02 - tOl 

WRITE (23 , *)' *** K, * EIG* , *HERTZ *,* ERROR *,* NORM *** iam' 

if ( ishif t . eq . 0) then 
DO 700 K= 1 , NEIG 

c HERTZ = 1 . 0/ (2.0*3 . 1415 92 7*DSQRT (EIG (K) ) ) 

HERTZ = 1 . 0/ (2 . 0*3 . 14 1 5 92 7 *DSQRT (DABS (EIG (K) ) ) ) 

WRITE (23, 701) K, 1.0/EIG(K) , HERTZ , ERR (K) , HH (K) 

701 FORMAT ( 2X ,I5,2X,4E15.7) 

700 CONTINUE 

else 

DO 705 K=1 , NEIG 
EIG (K) =1 . /EIG (K) -float ( ishif t) 
c HERTZ=DSQRT (EIG (K) ) / (2 . 0*3 . 1415 92 7) 

HERTZ=DSQRT (DABS (EIG (K) ) ) / (2.0*3 . 1415927) 
c WRITE (23,701)K,1. 0 /EIG (K) , HERTZ , ERR (K) , HH (K) 

WRITE (23, 701) K, EIG (K) , HERTZ , ERR (K) , HH (K) 

705 CONTINUE 

endif 

NSUC=1 

if (iam. eq. 0) write (23 , *) 'JACOBIQ: Steps in IAM = ',nj,iam 
RETURN 

3 00 NNNN=MIN0 ( LANMAX- 1 , NJ+ 3 * (NEIG- I) +4) 

NSUC=0 

DO 304 1 = 1, LANMAX 
EIG ( I ) =HH ( I ) 

304 ERR { I ) =SUB ( I ) 

RETURN 

END 

c 

£★★ + ★*★** + ★ + ★★★* + ★★*** + + * + ★* + *★★ + ★***★★*★*★★ + + ★ + ★***•*★*****’*■★★★★★★ + * 

c 


subroutine REORD2 (n,ncoff , nreord, mtota, mtoti , a, iq,neig) 
real*8 a(l) 
c 

c This is an additional routine to reorder as well the 

c consistant mass 

c 
c 

c note personel 

c si je met un IF LUM en dehors, il faut commenter celui-ci 

c 

integer IQ(1) 

if (2*ncoff . gt .mtota) then 

write ( * , * ) ' REORD : increase MTOTA to: 2*ncoff = ',2*ncoff 
stop 
endif 

if (3*ncof f +7*n+5 .gt .mtoti) then 





no no non u> ooooo 


write ( * , * ) ' REORD : increase MTOTI to: 3+ncof f +7*n+5 = 

1 3*ncof f +7*n+5 
stop 
endif 

if (nreord . ne . 0) then 
call cputime ( timeO) 

call ireadO (n, ncoff , iq(l) , iq(l+n) , iq(2*n+l) , iq (3*n+2) , 

1 iq(4*n+3) , iq(5*n+4) , iq (5*n+4+ncof f ) ) 
call cputime (timel) 
rewind ( 18 ) 

write (18) ( iq ( i ) , i=5*n+4+ncof f , 5*n+3 +3 *ncof f ) 

call genmmd (n , iq(4*n+3) , iq ( 5*n+4+ncof f ) , iq(i+n) , 

4 iq (6*n+5+3*ncof f ) ,iq(2*n+l) , 

3 iq { 3*n+2 ) , iq ( 1 ) , iq (5*n+4+3*ncof f ) , nof sub, maxcon, nterms) 
rewind { 18 ) 

read (18) (iq (i) , i=5*n+4+ncof f , 5*n+3+3*ncof f ) 
call cputime (time2) 

call getnewk (n, iq (1) , iq ( 1+n) , iq (2*n+l) f iq (3*n+2) , iq (4*n+3) r 

3 iq ( 5*n+4+ncof f ) , iq ( 5*n+4) ) 

call copyk (n, iq (1) , iq(l+n) , iq (2*n+l) , iq (5*n+4) , iq (5*n+4+ 

4 ncof f ) , a (1) , a (1+ncof f ) , ncof f ) 

pierrot 

IF ( LUMP . NE . 1 ) THEN 

call copyk2 (n, iq (1) , iq(l+n) , iq(2*n+l) , iq(5*n+4) , iq(5*n+4+ 

4 ncoff) , a {l+3*ncof f ) , a (l+2*ncof f ) , ncoff) 

4 ncof f) , a (1) , a (1+ncof f) , ncof f) 

ENDIF 

pierrot 

call copydb (n, iq (1+n) , a (1) , a (1+n) , a (2*n+l) , a (3*n+l) , neig) 

988 continue 

write (23 ,*)' CPU to get MD reordering = / , time2 -timel 

endif 

return 

end 

★ ★★★★★it***********************************************-************** 

subroutine copyk2 (n, ia, perm, iu, ju, ja, un, am, ncof f) 
integer ia (1) , iu (1) , ja (1) , ju (1) , perm(l) 
real*8 am(l) / un(l) 

do i=l, ncoff 
un ( i ) = 0 . 
am ( i ) =0 . 
enddo 

rewind ( 11 ) 

read(ll) ( j a ( i ) ,i=l, ncoff) 
rewind ( 17 ) 

read(17) (am(i) , i=l, ncoff) 
rewind ( 15 ) 

read (15) (ia(i) ,i=l,n) 


do 200 I = 1 , n-1 
10 = perm(i) 

do 220 J = iu ( I) , iu (1+1) - 1 

JO = perm{ ju ( J) ) 
ijO = iO 
i j 00 = jO 
if(jO.LT.iO) then 
ijO = jO 
i j 00 = iO 
endif 

CDIR$ IVDEP 

do 230 jj = ia (ij 0) , ia (i j0+l) -1 
if ( ja ( j j ) .NE. ijOO) go to 230 
un { J) = am ( JJ) 
go to 220 
continue 
continue 
continue 
rewind ( 17 ) 

write(17) (un (i) , i = l, ncof f ) 


230 

220 

200 


8 Z 



no non o o o o o oonooooo on 


c 

write (* 

*) 

' PERM 

i 

— / 

(perm(i) , i = i, 10 ) 

c 

write ( * 

*) 

' IA 

= ' , 

( IA (I) , i=l, 10) 

c 

write ( * 

*) 

' JA 

_ > 

— / 

(JA(i) , i=l , 10) 

c 

write ( + 

*) 

' IU 

__ f 
- / 

( IU ( I ) , i-l , 10) 

c 

write ( * 

*) 

' JU 

= ' , 

( JU (i) , i-l ,10) 

c 

write { * 

*) 

; AM 

= ' , 

(am(i) , i = 1 , 1 0 ) 

c 

write (* 

*) 

; UN 

= * 

(un (i) , i-l, io) 


return 

end 


c 


subroutine RE0RD3 (n, ncof f , nreord, mtota, mtoti , a, iq, neig) 
real*8 a(l) 

This is an additional routine to reorder as well the 
consistant mass 


note personel 

si je met un IF LUM en dehors, il faut commenter ceLui-ci 

integer IQ ( 1 > 

if (2*ncoff . gt . mtota) then 

write (*,*)' REORD : increase MTOTA tor 2*ncoff = ' , 2+ncof f 

stop 

endif 

if ( 3 *ncof f + 7*n+5 .gt .mtoti) then 

write (*, *) ' REORD : increase MTOTI to: 3 *ncof f +7*n-h5 = 

1 3 *ncof f +7*n+5 
stop 
endif 

if (nreord . ne . 0) then 
call cputime (timeO) 

call ireadO {n,ncof f , iq(l) , iq(l+n) , iq(2*n+l) , iq(3*n+2) , 

1 iq (4*n+3 } , iq { 5*n+4) , iq ( 5*n+4+ncof f ) ) 
call cputime ( time 1) 
rewind (18 ) 

write (18) < iq ( i ) , i=5*n+4+ncof f , 5*n+3+3*ncof f ) 

call genmmd (n, iq (4*n+3 ) , iq ( 5*n+4+ncof f ) , iq (1+n) , 

4 iq (6*n+5+ 3*ncof f ) , iq(2*n+l) , 

3 iq(3*n+2) , iq ( 1 ) , iq { 5 *n+4+3*ncof f ) , nof sub, maxcon, nterms) 
rewind { 18 ) 

read(18) ( iq ( i ) , i = 5*n+4+ncof f , 5*n+3+3*ncof f ) 
call cputime (time2 ) 

call getnewk (n, iq { 1) , iq ( 1+n) , iq (2*n+l) , iq (3*n+2) , iq (4*n+3 ) , 

3 iq ( 5*n+4+ncof f ) , iq (5*n+4) ) 

call copyk (n, iq (1) , iq(l+n) , iq(2+n+l) , iq(5*n+4) , iq(S^n+4^ 

4 ncof f ) , a ( 1) , a (1+ncoff ) , ncof f ) 

pierrot 

IF ( LUMP . NE . 1 ) THEN 

call copyk 3 (n, iq(l) , iq(l+n) , iq(2*n+l) ,iq(5*n+4) , iq(5*n+4+ 

4 ncoff) , a ( l+3*ncof f ) , a (l+2*ncof f ) ,ncoff) 

4 ncoff) ,a(l) ,a(l+ncoff) ,ncoff) 

ENDIF 

pierrot 

call copydb (n, iq ( 1+n) , a (1) , a (1+n) , a (2+n+l) , a (3*n+l) ,neig) 

988 continue 

write (23 ,*)' CPU to get MD reordering = # , time2 - timel 

endif 

return 

end 

★ ★*★*****★*★ + ** + *★*★**★★ + + + ***■★-*★** + **★**★★**★★**■*• + *★**★***★★★★★**** 

subroutine copyk3 (n, ia, perm, iu, ju, ja, un, am, ncof f ) 
integer ia(l) ,iu(l) ,ja(l) ,ju(l) , perm (1) 
real*8 am ( 1) , un (1) 

do i=l, ncoff 

un (i) =0 . 83 



c 

c 

c 

c 


c 

c 


do 200 I = 1 , 11-1 
10 = perm(i) 

do 220 J = iu(I) , iu(I + l) - 1 
JO = perm ( ju ( J) ) 
ijO = iO 
i j 00 = jO 
if (jO. LT . iO ) then 
i j 0 = jO 
i j 00 = iO 
endif 

CDIR$ IVDEP 

do 230 jj = ia ( i j 0) , ia (i j 0+1) - 1 
if ( j a ( j j ) .NE . i j 00) go to 230 
un{J) = am{JJ) 
go to 220 

230 continue 

220 continue 

200 continue 
rewind (17) 

write(17) (un(i) ,i=l,ncoff) 


c 

write ( * , * ) 

' PERM =' , 

(perm(i) , i=l, 10) 

c 

write ( * , * ) 

' IA 

( IA ( I ) , i=l, 10) 

c 

write ( * , * ) 

' JA 

(JA(i) , i=l, 10) 

c 

write ( * , * ) 

' IU 

{ IU ( I ) , i=l, 10) 

c 

write ( * , * ) 

' JXJ 

( JU ( i ) , i=l, 10) 

c 

write (* , * ) 

' AM 

(am ( i ) , i=l , 10 ) 

c 

write ( * , * ) 

' UN 

(un (i) , i = l, 10) 


return 

end 


c 

c 

c 


am(i) =0. 
enddo 

rewind (11 ) 

read (11) (ja(i) ,i=l,ncoff) 
rewind ( 17 } 

read(17) (am (i) , i = l, ncoff ) 
rewind (15) 

read (15) (ia (i) , i=l,n) 


S 4 



# 

EXEC 

OBJ 


SOURCE 


F77 

#F77 

FFLAGS= 

#FFLAGS 

# FFLAGS 

#DEBUG 

.f .o: 


$ (EXEC) 


source : 


= a . out 

= spamain.o spaldln.o reord.o meikolan.o blanlib.o \ 

j acobi2 . o matmat3.o multspa2.o normcheck.o spasubspaceN . o\ 
meikolan2.o reord2.o reord3.o 
= spamain.f spaldln.f reord.f meikolan.f blanlib.f \ 

jacobi2.f matmat3.f multspa2.f normcheck.f spasubspaceN. f\ 
meikolan2.f reord2 . f reord3 . f 
= f 7 7 

= /usr/local/lang/f 77 
-Mperf 

: -04 -Knoieee -Mvect 
= “9 

$ ( F77 ) $ (DEBUG) $ (FFLAGS) -c $< 

: $ (OBJ) 

$ ( F7 7 ) -o $ (EXEC) $ (OBJ) $ (FFLAGS) 


cat $ (SOURCE) > source . f 


85 - 



1234567890 

1234567890 

1875.0000000000 87890.625000001 

0.56250000000000 750.00000000000 

131.83593750000 1 . 4423076923077D 

3749.9999999999 175781.25000000 

1.1250000000000 1500.0000000000 

131.83593750000 1 . 4423076923077D 

3749.9999999999 175781.25000000 

1.1250000000000 1500.0000000000 

131.83593750000 1 . 4423076923077D 

3749.9999999999 175781.24999999 

1.1250000000000 1500.0000000000 

131.83717347736 1 . 4423121995474D 

3750.0117188233 175782.89796982 

1.1250035156470 1500.0046875293 

131.83593750000 1 . 4423076923077D 

1874.9999999999 87890.624999994 

0.56249999999998 749.99999999998 


65 . 917968750000 7 . 2115384615384D-03 

3750.0000000000 175781.25000000 

-02 1.1250000000000 1500.0000000000 

131.83593750000 1 . 4423076923077D-02 

3749.9999999999 175781.25000000 

-02 1.1250000000000 1500.0000000000 

131.83593750000 1 . 4423076923077D-02 

3749.9999999999 175781.25000000 

-02 1.1250000000000 1500 . 0000QQ0Q00 

131.83593749999 1 . 4423076 923 077D-02 

3750.0117188232 175782.89796981 

-02 1.1250035156470 1500.0046875293 

131 . 83717347737 1 . 4423121995474D-02 

3750.0000000000 175781.25000000 

-02 1.1250000000000 1500.0000000000 

65 . 917968749995 7 . 21153 84615382D- 03 


1 . 

2 . 3 

4 . 

5 . 

6 . 

7 . 

03 

00 

10 . 



11 . 

12 . 
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16 . 
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22 . 

23 . 

24 . 

25 . 

26 . 

27. 

28 . 

29. 

30 

31. 

32 . 

33 . 

34 . 

35 . 

36 . 

37. 

38 . 

39. 

40 

41 . 

42 . 

43 . 

44 . 

45 . 

46 . 

47. 

48 . 

49 . 

50 

51 . 

52 . 

53 . 

54 . 

55 . 

56 . 

57. 

58 . 

59 . 

60 

61 . 

62 . 

63 . 

64 . 

65 . 

66 . 






5. 866399999999 9D -09 5 . 8663999999999D-09 5 . 8663999999999D-09 0. 0. 

0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 0. 

0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 17328000000QOD-08 

0. 0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1 . 1732763335000D-08 1 . 1732763335000D-08 1 . 173276333500QD-08 

0. 0. 0. 1. 1732763335000D-08 1 . 1732763335000D-08 1.17327633350000-08 

0. 0. 0. 1. 1732800000000D-08 1 . 1732800000000D-08 1 . 17328000000000-08 

0. 0. 0. 5 . 8664 000000001D-09 5 . 8664000000001D-09 5.86640000000010-09 


0 . 0 . 0 . 


66*1.0 

Testing Agoura Hill's small eigen data !! 

3 10 1 66 66 114 0 +1000000.0 0 -1 

nreord, neig, lump, neq, neq, ncoef, itime, ishift, iblock, mread 
13312213312212212212212212 
21221221221331221331221331 
22133122011000 

5 . 8663 999999999D- 09 5 . 8663999999999D-09 5 . 8663999999999D-09 0. 0. 

0. 1 . 1732800000000D-08 1.1732800000000D-08 1 . 1732800000000D-08 0. 

0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-Q8 

0. 0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1. 173280000000 0D -08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1 . 1732 800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1. 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 1 . 173276333 5000D- 08 1 . 1732763335000D-08 1.17327633350000-08 

0. 0. 0. 1 . 1732763335000D-08 1 . 1732763335000D-08 1 . 1732763335000D-08 

0. 0. 0. 1 . 1732800000000D-08 1 . 1732800000000D-08 1 . 1732800000000D-08 

0. 0. 0. 5 . 8664 000000001D-09 5 . 8664000000001D-09 5 . 8664000000001D-09 

0. 0. 0. 

-1875.0000000000 7031.2500000001 -87890.625000001 7031.2500000001 

-5.2734375000000 -65.917968750000 -5.2734375000000 - 7 . 21153846153 84D-03 

5.2734375000000 0.28125000000000 -7031.2500000001 375.00000000000 

-1875.0000000000 - 8 . 73114913 70201D- 11 -87890.624999999 7031.2500000000 

8 . 52651282 912 12D- 14 -65.917968750000 -5.2734 3 74999999 - 7 . 2115384615383D- 03 

5.2734374999999 0.28125000000000 -7031.2500000000 375.00000000000 

-1875.0000000000 -87890.624999999 7031.2500000000 -65.917968750000 

-5.2734374999999 -7 . 2115384615383D-03 5.2734374999999 0.28125000000000 

-7031.2500000000 375.00000000000 -1875.0000000000 -87890.624999999 

7031.2500000000 -65.917968750000 -5.2734374999999 -7 . 2115384615383D-03 

5.2734374999999 0.28125000000000 -7031.2500000000 375.00000000000 

-1875.0000000000 -87890.624999999 7031.2500000000 -65.917968750000 

-5.2734374999999 -7 . 2115384615383D-03 5.2734374999999 0.28125000000000 

-7031.2500000000 375.00000000000 -1875.0000000000 -87890.624999999 

7031.2500000000 -65.917968750000 -5.2734374999999 -7 . 2115384615383D-03 

5.2734374999999 0.28125000000000 -7031.2500000000 375.00000000000 

-1874.9999999999 -2 . 9103830456734D-10 -87890.624999994 7031.2499999997 

2 .2737367544323D-13 -65.917968749995 -5.2734374999997 -7 . 211538461S3 82D- 03 

5.2734374999997 0.28124999999999 -7031.2499999997 375.00000000000 

-1875.0117188232 8 . 7891449307790D-02 -87892.272969819 7031.3378914490 

-6 . 5918587040414D-05 -65.919204727364 -5.2735034185868 -7 . 2115835339355D-03 

n 



5 .2735034185868 
-1875 . 0000000000 
6 . 5 918 58 6 58 566 7D- 
5 .2734375000002 
-1874 . 9999999999 
4 . 5474735088646D- 
5.2734374999997 
-7031.2499999997 


0.28125175782348 -7031.3378914490 375.00234376465 

-8 . 7 8 91448 72 5 7 13D- 02 -87890.625000005 7031.2500000003 

05 -65.917968750004 -5.2734375000002 - 7 . 21153 846 153 8 5D- 03 

0.28125000000000 -7031.2500000003 375.00000000001 

-5. 820766091346 7D -10 -87890.624999994 7031.2499999997 

13 -65.917968749995 -5.2734374999997 - 7 . 2 115 3 846 1 5 382D- 03 

0.28124999999999 -7031.2499999997 375.00000000000 

5 . 2734374999997 


114*0 . 0 
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